(57) Abstract

Provided herein are methods and immunogenic compositions useful for protecting mammals from infection and pathology of P.
gingivalis. Specifically, arginine-specific proteases of Porphyromonas gingivalis and peptides derived therefrom offer protection against
infection. Immunogenic compositions comprising a 50 kDa arginine-specific protease, the high molecular weight complex or peptides from

IMMUNOGENIC COMPOSITIONS COMPRISING
PORPHYROMONAS GINGIVALIS PEPTIDES AND METHODS

STATEMENT RE FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT 
This invention was made, at least in part, with funding 
from the National Institutes of Health (Grant Nos. DE 09761,
DE 09161, RR 03034, HL 26148 and HL 37090). Accordingly, the 
United States Government may have certain rights in this 
invention. 

BACKGROUND OF THE INVENTION 

The field of this invention is immunogenic compositions
comprising bacterial proteases and/or peptides derived
therefrom, more particularly those of Porphyromonas
gingivalis, most particularly the arginine-specific proteases
and immunogenic compositions containing Arg-gingipains and/or
peptides derived therefrom, and the lysine-specific proteases
termed Lys-gingipains herein and immunogenic compositions
containing Lys-gingipain(s) and/or peptides derived therefrom.
Those immunogenic compositions are useful in the protection of
a mammal, including a human, from infection and pathology
caused by P. gingivalis.

Porphyromonas gingivalis (formerly Bacteroides
gingivalis) is an obligately anaerobic bacterium which is
implicated in periodontal disease. P. gingivalis produces
several distinct proteolytic enzymes; its proteinases are
recognized as important virulence factors, together with other
factors such as lipopolysaccharide and a polysaccharide
capsule, fimbriae, lectin-like adhesins, hyaluronidase,
keratinase, superoxide dismutase and hemagglutinating and
hemolyzing activities. A number of physiologically
significant proteins, including collagen, fibronectin,
immunoglobulins, complement factors C3, C4, C5, and B,
lysozyme, iron-binding proteins, plasma proteinase inhibitors,
fibrin and fibrinogen, and factors of the plasma coagulation
cascade system, are hydrolyzed by P. gingivalis proteases.
Broad proteolytic activity plays a role in the evasion of host
defense mechanisms and the destruction of gingival connective
tissue in progressive periodontitis [Saglie et al. (1988) J.
Periodontol. 59:259-265].

Progressive periodontitis is characterized by acute
tissue degradation promoted by collagen digestion and a
vigorous inflammatory response characterized by excessive
neutrophil infiltration [White and Maynard (1981) J.
Periodontal Res. 16:259-265]. Gingival crevicular fluid
accumulates in periodontitis as periodontal tissue erosion
progresses at the foci of the infection, and numerous plasma
proteins are exposed to proteinases expressed by the bacteria
at the injury site. Neutrophils are recruited to the gingiva,
in part, by the humoral chemotactic factor C5a. The
complement components C3 and C5 are activated by complex
plasma proteases with "trypsin-like" specificities called
convertases [Muller-Eberhard (1988) Ann. Rev. Biochem.
57:321-347]. The human plasma convertases cleave the α-chains
of C3 and C5 at a specific site generating biologically active
factors known as anaphylatoxins (i.e. C3a and C5a). The
anaphylatoxins are potent proinflammatory factors exhibiting
chemotactic and/or spasmogenic activities as well as promoting
increased vascular permeability. The larger products from C3
and C5 cleavage (i.e. C3b and C5b) participate in functions
including complement cascade activation, opsonization, and
lytic complex formation.

There are conflicting data as to the number and types of
proteinases produced by P. gingivalis. In the past,
proteolytic activities of P. gingivalis were classified into
two groups: those enzymes which specifically degraded collagen
and the general "trypsin-like" proteinases which appeared to
be responsible for other proteolytic activity. Chen et al.
(1992) J. Biol. Chem. 267:18896-18901 reported the first
rigorous purification and biochemical characterization of an
arginine-specific P. gingivalis protease; the purification of
a lysine-specific proteinase of P. gingivalis is described by
Pike et al. (1994) J. Biol. Chem. 269:406-411 [see also
Potempa et al. (1995) Perspectives in Drug Discovery and
Design 2:445-458].

SUMMARY OF THE INVENTION
An object of the present invention is to provide
immunogenic compositions comprising at least one peptide
corresponding in sequence to the N-terminus of at least one
arginine-specific proteinase derived from P. gingivalis,
preferably from Arg-gingipain, termed Arg-gingipain-1 (or
RGP-1), having an apparent molecular mass of 50 kDa as estimated
by sodium dodecyl sulfate polyacrylamide gel electrophoresis
and an apparent molecular mass of 44 kDa as estimated by gel
filtration chromatography, and enzymological properties as
described hereinbelow. In a specifically exemplified RGP
protein, the protein is characterized by an N-terminal amino
acid sequence as given in SEQ ID NO:1
(YTPVEEKQNGRMIVIVAKKYEGDIKDFVDWKNQR) and by a C-terminal amino
acid sequence as given in SEQ ID NO:2 (ELLR). A second
Arg-specific gingipain has an N-terminal sequence as given in SEQ
ID NO:24 (YTPVEEKENGRMIVIVAKKY); it differs from the sequence
as given in SEQ ID NO:10 in that position 7 is Glu rather than
Gln.
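
The relationship between the two N-terminal sequences quoted above can be checked with a short comparison. The following is a minimal sketch in Python and is not part of the original disclosure. The helper name `differing_positions` is hypothetical, and because SEQ ID NO:10 is not reproduced in this section, the first 20 residues of SEQ ID NO:1 are used as a stand-in on the assumption that they match it.

```python
def differing_positions(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, residue in a, residue in b) for every
    mismatch over the shared length of two one-letter-code peptides."""
    return [
        (i, x, y)
        for i, (x, y) in enumerate(zip(a, b), start=1)
        if x != y
    ]


# Sequences exactly as quoted in the text above.
SEQ_ID_NO_1 = "YTPVEEKQNGRMIVIVAKKYEGDIKDFVDWKNQR"
SEQ_ID_NO_24 = "YTPVEEKENGRMIVIVAKKY"

# zip() stops at the shorter sequence, so only the first 20 residues are compared.
diffs = differing_positions(SEQ_ID_NO_1, SEQ_ID_NO_24)
print(diffs)  # [(8, 'Q', 'E')]
```

As printed here, the only mismatch is the Gln (Q) to Glu (E) substitution. The sketch numbers residues from 1, so the index it reports may differ from the numbering convention used in the text.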

Within the scope of the present invention are methods for
protecting a mammal, including a human, from periodontitis
and/or other pathology caused at least in part by P.
gingivalis, said method comprising the step of administering
to said mammal an immunogenic composition comprising at least
one peptide corresponding in sequence to the amino-terminus of
at least one of RGP-1, RGP-2, HMW RGP, or one or more peptides
derived from one or more of the foregoing proteins or having
amino acid sequence(s) taken from the amino acid sequence(s)
of one or more of the foregoing proteins, wherein said peptide
or protein, when used in an immunogenic composition in an
animal, especially a mammal or human, confers protection
against infection by and/or periodontitis caused at least in
part by P. gingivalis. Preferred immunogenic compositions for
protecting mammals (e.g., man) from P. gingivalis infection do
not include a hemagglutinin protein or peptide.

A further object of this invention is to provide immunogenic
compositions comprising an N-terminal peptide derived from the
catalytic subunit of a high molecular weight Arg-gingipain
(HMW RGP), which comprises a proteolytic component essentially
as described hereinabove and at least one hemagglutinin
component. A nucleotide sequence encoding the HMW RGP complex
polyprotein is given in SEQ ID NO:5, nucleotides 949-6063, and
the deduced amino acid sequence is given in SEQ ID NO:6. As
specifically exemplified, the mature HMW RGP has a 50 kDa
protease component (same as RGP-1) having a complete deduced
amino acid sequence as given in SEQ ID NO:6 from amino acid
228 through amino acid 719 or in SEQ ID NO:4, amino acids
228-719. HMW RGP further comprises at least one hemagglutinin
component. The encoded RGP-hemagglutinin complex is
transcribed as a prepolyprotein, with the amino acid sequence
of at least one hemagglutinin protein as given in SEQ ID NO:6
from amino acid 720-1091, from 1092-1492 and/or from
1430-1704.

Compositions and immunogenic preparations including but
not limited to vaccines, comprising at least one peptide
antigen derived from the N-terminus of an Arg-gingipain from
P. gingivalis and/or a peptide derived from an Arg-gingipain,
and/or a Lys-gingipain and a suitable carrier therefor are
provided. Such immunogenic compositions and vaccines are
useful, for example, in immunizing an animal, including a
human, against infection by and/or the inflammatory response
and tissue damage caused by P. gingivalis in periodontal
disease. The vaccine preparations comprise an immunogenic
amount of an Arg-specific proteinase, Lys-gingipain, or an
immunogenic peptide fragment or subunit of either one or both
of said Arg-gingipains and Lys-gingipains or other P.
gingivalis protease. Such vaccines may comprise one or more
N-terminal peptides from Arg-gingipains and/or one or more
Lys-gingipains and/or an Arg-gingipain or Lys-gingipain in
combination with another protein or other immunogen. By
"immunogenic amount" is meant an amount capable of eliciting
the production of antibodies directed against one or more
Arg-gingipain and/or Lys-gingipain catalytic subunit (or one or
more peptides whose amino acid sequence is derived from the



06/30/2003, EAST Version: 1.03.0002 



WO 97/34629



PCT/US97/04635 



foregoing proteins) in an individual or animal to which the 
vaccine has been administered. 

Oligopeptides of the present invention include those of
about 30 amino acids or less, and include those comprising
sequences as given in SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:17, SEQ ID
NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:23
and SEQ ID NO:24. These oligopeptides can be formulated into
vaccine compositions which are effective in protecting an
animal, including a human, from infection by P. gingivalis and
from periodontitis caused by P. gingivalis.

BRIEF DESCRIPTION OF THE FIGURES 
Figure 1 illustrates the composite physical map of HMW
RGP Arg-gingipain-2 DNA clones. The first codon of the mature
gingipain is indicated. Clones PstI(1)/PstI(2807),
SmaI(1391)/BamHI(3159), and PstI(2807)/BamHI(3159) are
represented. The arrows indicate the extent and direction of
sequencing. M13 primers and internal primers were used to
sequence both strands of the putative HMW RGP gene, initially
as double strand sequencing on clone PstI(1)/PstI(2807) and
then as single strand sequencing on PstI(1)/PstI(2807) clone
and on PstI(2807)/BamHI(3159) clone in both directions. The
junction PstI(2807) was sequenced on double stranded clone
SmaI(1391)/BamHI(3159). Only restriction sites employed in
cloning are indicated.

Figure 2 presents a comparison of the polyprotein 
structures of HMW RGP and HMW KGP. Identical shading in the 
two diagrams indicates regions of amino acid sequence 
identity. 

Figure 3 provides a sequence comparison of enzymatically
active components of HMW KGP and HMW RGP polyproteins, with
dashes inserted to optimize alignment of the two sequences.

Figure 4 diagrammatically illustrates the structure of
pro-gingipain R1 (RGP-1), with indicated locations of peptides
used for animal immunizations. The initial transcript of the
rgp1 gene consists of propeptide, catalytic, and
adhesin/hemagglutinin domains [Pavloff et al. (1995) J. Biol.
Chem. 270:1007]. During translocation onto the P. gingivalis
surface, the polyprotein undergoes proteolytic processing,
resulting in the formation of mature RGP-1, either in membrane
bound or soluble forms consisting of a non-covalent complex of
a catalytic polypeptide and fragments of the
adhesin/hemagglutinin domain [Pike et al. (1994) J. Biol.
Chem. 269:406]. The adhesin/hemagglutinin domain is divided
into subdomains (HGPs) of 44, 15, 17, and 27 kDa, according to
proteolytic processing after one Lys and 3 Arg residues
(arrowheads). The hemagglutination active site (Peptide D) is
a part of a triplicate amino acid sequence repeat present in
the HGP44, HGP17, and HGP27 subdomains. The triplicate
repeats of a 50 amino acid sequence within the
adhesin/hemagglutinin domain are represented by hatched boxes
numbered beneath the structure. RGP-2 is also translated as a
proenzyme, nearly identical in sequence to the catalytic
domain of RGP-1 but missing the entire adhesin/hemagglutinin
domain. The structure of the Lys-gingipain polyprotein is
similar to RGP-1, with the adhesin/hemagglutinin domain being
virtually identical. The initial Lys-gingipain translation
product is subject to posttranslational processing by
Arg-gingipain(s) [Okamoto et al. (1996) J. Biochem. 120:398]. The
catalytic domains of both gingipains share only limited
identity (27%) scattered throughout the polypeptide chain,
except for an identical 30 amino acid residue fragment
(Peptide C). The cleavage of the propeptide which releases
active RGPs is shown by an arrow. Arrowheads indicate
putative proteolytic processing sites leading to assembly of
the soluble or membrane-bound enzyme (95 kDa) in the form of a
noncovalent complex of the catalytic domain with indicated,
active fragments of the adhesin/hemagglutinin domain (HGP).

Figure 5 graphically illustrates the results of
competitive ELISA. Chamber fluid from mice immunized with
heat-killed P. gingivalis was preincubated with increasing
concentrations of RGP-1 (light bars) and KGP (dark bars) as
competing antigens before the mixture was added to a
microtitration plate coated with whole P. gingivalis cells.
The amount of antibody specifically bound to bacterial surface
antigens was determined by subsequent binding of
peroxidase-labeled goat anti-mouse IgG antibodies.
Figures 6A-6E illustrate Western-blot analyses of chamber
fluid samples. Purified gingipains (RGP-1, RGP-2, and KGP) and
samples of P. gingivalis vesicles and membranes were boiled,
resolved by SDS-PAGE and transferred to nitrocellulose. The
nitrocellulose was transiently stained with Ponceau S, the
positions of molecular weight markers (Pharmacia), RGP-2, and
polypeptide chains constituting RGP-1 complex were marked
(dots to the right of an appropriate lane), and the blots were
incubated in chamber fluid obtained from mice immunized with
either: Fig. 6A, the N-terminal peptide of the catalytic domain
of RGPs (Peptide A) (1,000 fold dilution); Fig. 6B, RGP-1
(1,000 fold dilution); Fig. 6C, the peptide derived from the
adhesin/hemagglutinin domain of RGP-1 (Peptide D) (100 fold
dilution); Fig. 6D, heat killed P. gingivalis (1,000 fold
dilution) or Fig. 6E, RGP-2 (1,000 fold dilution). Alkaline
phosphatase-labeled goat anti-mouse IgG was then added and
blots were developed.

DETAILED DESCRIPTION OF THE INVENTION 
Abbreviations used herein for amino acids are standard in
the art: X or Xaa represents an amino acid residue that has
not yet been identified but may be any amino acid residue
including but not limited to phosphorylated tyrosine,
threonine or serine, as well as cysteine or a glycosylated
amino acid residue. The abbreviations for amino acid residues
as used herein are as follows: A, Ala, alanine; V, Val,
valine; L, Leu, leucine; I, Ile, isoleucine; P, Pro, proline;
F, Phe, phenylalanine; W, Trp, tryptophan; M, Met, methionine;
G, Gly, glycine; S, Ser, serine; T, Thr, threonine; C, Cys,
cysteine; Y, Tyr, tyrosine; N, Asn, asparagine; Q, Gln,
glutamine; D, Asp, aspartic acid; E, Glu, glutamic acid; K,
Lys, lysine; R, Arg, arginine; and H, His, histidine. Other
abbreviations used herein include Bz, benzoyl; Cbz,
carboxybenzoyl; pNA, p-nitroanilide; MeO, methoxy; Suc,
succinyl; OR, ornithyl; Pip, pipecolyl; SDS, sodium dodecyl
sulfate; TLCK, tosyl-L-lysine chloromethyl ketone; TPCK,
tosyl-L-phenylalanine chloromethyl ketone; S-2238,
D-Phe-Pip-Arg-pNA; S-2222, Bz-Ile-Glu-(γ-OR)-Gly-pNA; S-2288,
D-Ile-Pro-Arg-pNA; S-2251, D-Val-Leu-Lys-pNA; Bis-Tris,
2-[bis(2-hydroxyethyl)amino]-2-(hydroxymethyl)-propane-1,3-diol;
FPLC, fast protein liquid chromatography; HPLC, high performance
liquid chromatography; Tricine,
N-[2-hydroxy-1,1-bis(hydroxymethyl)ethyl]glycine; EGTA,
[ethylene-bis(oxyethylene-nitrilo)]tetraacetic acid; EDTA,
ethylenediamine-tetraacetic acid; Z-L-Lys-pNA,
Z-L-Lysine-p-nitroanilide; HMW, high molecular weight.

Arg-gingipain (RGP) is the term given to a P. gingivalis
enzyme with specificity for proteolytic and/or amidolytic
activity for cleavage of a peptide and/or an amide bond, in
which L-arginine contributes the carboxyl group. The
Arg-gingipains described herein have identifying characteristics
of cysteine dependence, inhibition response,
Ca2+-stabilization and glycine stimulation. Particular forms of
Arg-gingipain are distinguished by the apparent molecular
masses of the mature proteins (as measured without boiling
before SDS-PAGE). See also Chen et al. (1992) supra.
Arg-gingipains of the present invention have no amidolytic or
proteolytic activity for peptide and/or amide bonds in which
L-lysine contributes the -COOH moiety.
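
The specificity described above, cleavage of the bond whose carboxyl group is contributed by L-arginine and no cleavage after L-lysine, can be illustrated with a simple in-silico digest. This is a minimal sketch under stated assumptions: `cleave_after` is a hypothetical helper, the rule ignores protein folding, site accessibility and neighboring residues, and the SEQ ID NO:1 peptide is used only as a convenient input string.

```python
def cleave_after(sequence: str, residue: str) -> list[str]:
    """Split a one-letter-code peptide after every occurrence of `residue`,
    i.e. at bonds where that residue contributes the carboxyl group."""
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        if aa == residue:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep any trailing stretch that does not end in the target residue.
        fragments.append(sequence[start:])
    return fragments


PEPTIDE = "YTPVEEKQNGRMIVIVAKKYEGDIKDFVDWKNQR"  # SEQ ID NO:1 as quoted in the text

# Arg-specific rule (the behavior attributed to Arg-gingipain).
print(cleave_after(PEPTIDE, "R"))
# ['YTPVEEKQNGR', 'MIVIVAKKYEGDIKDFVDWKNQR']

# Lys-specific rule, shown for contrast only.
print(cleave_after(PEPTIDE, "K"))
# ['YTPVEEK', 'QNGRMIVIVAK', 'K', 'YEGDIK', 'DFVDWK', 'NQR']
```

Under the Arg-specific rule the lysine residues are left intact, which is the distinction the definition draws between Arg-gingipain and a lysine-specific enzyme.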

Antibodies specific for RGPs are produced in adult
periodontitis patients, with the majority being reactive with
antigenic determinants in the hemagglutinin/adhesin domain of
RGP-1 [Curtiss et al. (1996) Infect. Immun. 64:2532].

Although patients with a history of destructive disease
frequently demonstrate an elevated IgG response to P.
gingivalis, these antibodies are apparently ineffective at
limiting continued disease progression [Turner et al. (1989)
Microbios 60:133; Yoshimura et al. (1987) Microbiol. Immunol.
31:935; Gunsolley et al. (1990) J. Periodontol. 61:412; Naito
et al. (1987) Infect. Immun. 55:832]. In several animal
studies, induction of an immune response to certain components
of P. gingivalis exacerbates disease [McArthur and Clark
(1993) J. Periodontol. 64:807]. Animal experiments described
herein have demonstrated the protective effect of P.
gingivalis-specific antibodies produced against peptides
derived from the N-terminus of RGP-1 (Fig. 1).

Arg-gingipain (RGP-1) is the name given herein to a
protein characterized as having a molecular mass of 50 kDa as
measured by SDS-PAGE and 44 kDa as measured by gel filtration
over Sephadex G-150, having amidolytic and/or proteolytic
activity for substrates having L-Arg in the P1 position, i.e.
on the N-terminal side of the peptide bond to be hydrolyzed,
dependent on cysteine (or other thiol groups) for full
activity, having sensitivity to cysteine protease
group-specific inhibitors including E64, iodoacetamide, iodoacetic
acid, and N-methylmaleimide, leupeptin, antipain,
trans-epoxysuccinyl-L-leucylamido-(4-guanidino)butane, TLCK, TPCK,
p-aminobenzamidine, N-chlorosuccinamide, and chelating agents
including EDTA and EGTA, but being resistant to inhibition by
human cystatin C, α2-macroglobulin, α1-proteinase inhibitor,
antithrombin III, α2-antiplasmin, serine protease
group-specific inhibitors including diisopropylfluorophosphate,
phenylmethyl sulfonyl fluoride and 3,4-dichloroisocoumarin. The
amidolytic and/or proteolytic activities are stabilized by Ca2+
and stimulated by glycine-containing peptides and glycine
analogs. Arg-gingipain-1 (RGP-1) is the 50 kDa protein whose
purification and characterization was disclosed in Chen et al.
(1992) supra and Wingrove et al. (1992) supra.

Arg-gingipain-2 (RGP-2) is a 50 kDa arginine-specific
proteinase whose purification is first described hereinbelow.
RGP-1 is distinguished from RGP-2 in that RGP-1 is not 
retained during chromatography over DE-52; RGP-2 is eluted 
from Whatman DE-52 with salt. A comparison of the primary 
structures of RGP-1 and RGP-2 is presented in Table 2. 

An exemplified Arg-gingipain termed HMW RGP herein has an
apparent molecular mass of 95 kDa as determined by SDS-PAGE
without boiling of samples. When boiled, it dissociates into
components of 50 kDa, 43 kDa, 27 kDa and 17 kDa.
Arg-gingipain-1 (RGP-1) is the name given to the 50 kDa,
enzymatically active component of the high molecular weight
complex.

The complete amino acid sequence of the exemplified
mature RGP-1 is given in SEQ ID NO:6, from amino acids
228-719. A second exemplary amino acid sequence is given in SEQ
ID NO:4, amino acids 1 through 510. The complete coding
sequence for the HMW RGP precursor polyprotein is given in SEQ
ID NO:5, nucleotides 949-6063. In nature these proteins are
produced by Porphyromonas gingivalis; they can be purified
from cells or from culture supernatant using the methods
provided herein. These proteins can also be produced
recombinantly in suitable host cells genetically engineered to
contain and express the exemplified (or synonymous) coding
sequences.

As used herein with respect to RGP-1 or RGP-2, a
substantially pure Arg-gingipain preparation means that there
is only one protein band visible after silver-staining an SDS
polyacrylamide gel run with the preparation, and the only
amidolytic and/or proteolytic activities are those with
specificity for L-arginine in the P1 position relative to the
bond cleaved. A substantially pure high molecular weight
Arg-gingipain preparation has only one band (95 kDa) on SDS-PAGE
(sample not boiled) or four bands (50 kDa, 43 kDa, 27 kDa, 17
kDa; sample boiled). Using a higher resolution tricine
SDS-PAGE system, an additional component of 19 kDa has been
detected in HMW RGP [Pavloff et al. (1995) supra]. No
amidolytic or proteolytic activity for substrates with lysine
in the P1 position is evident in a substantially pure HMW RGP.
Substantially pure Arg-gingipain is substantially free of
naturally associated components when separated from the native
contaminants which accompany it in its natural state.
Thus, Arg-gingipain that is chemically synthesized or
recombinantly synthesized in a cellular system different from
the cell from which it naturally originates will be
substantially free from its naturally associated components.
Techniques for chemical synthesis of polypeptides are
described, for example, in Merrifield (1963) J. Amer. Chem.
Soc. 85:2149-2156. A chemically synthesized Arg-gingipain
protein or peptide derived therefrom is considered an
"isolated" polypeptide or peptide.

Recombinantly produced RGP-1 and HMW RGP can be obtained
by culturing host cells genetically engineered to contain and
express the non-naturally occurring (recombinant)
polynucleotides comprising nucleotide sequences encoding an
Arg-gingipain as described herein under conditions suitable to
attain expression of the proteinase-encoding sequence. See,
e.g., U.S. Patent No. 5,523,390, incorporated by reference
herein.

Example 1 below and U.S. Patent No. 5,523,390 describe
the purification of a 50 kDa RGP-1 and HMW RGP from P.
gingivalis culture supernatant, i.e., from a natural source.
The isolation of an Arg-gingipain from
other biological material, such as from nonexemplified strains
of P. gingivalis or from cells transformed with recombinant
polynucleotides encoding such proteins, may be accomplished by
methods known in the art. Various methods of protein
purification are known in the art, including those described,
e.g., in Guide to Protein Purification, ed. Deutscher, Vol.
182 of Methods in Enzymology (Academic Press, Inc., San Diego,
1990) and Scopes, Protein Purification: Principles and
Practice (Springer-Verlag, New York, 1982).

Further analysis of the high molecular weight fractions
containing Arg-specific amidolytic and proteolytic activity
revealed that HMW RGP contained proteins of 44 kDa,
subsequently identified as a hemagglutinin, and 27 kDa and 17
kDa, which are also postulated to have hemagglutinating
activity. The empirically determined N-terminal amino acid
sequence of the complexed 44 kDa protein corresponds to amino
acids 720-736 of SEQ ID NO:6.

Purified RGP-1 exhibits an apparent molecular mass of
about 50 kDa as determined by SDS-polyacrylamide gel
electrophoresis. The size estimate obtained by gel filtration
on high resolution agarose (Superose 12, Pharmacia,
Piscataway, NJ) is 44 kDa. N-terminal sequence analysis
through 43 residues gave a unique structure which showed no
homology with any other proteins, based on a comparison in the
protein NBRF data base, release 39.0. The sequence obtained
is as follows: YTPVEEKQNGRMIVIVAKKYEGDIKDFVDWKNQR (SEQ ID
NO:1). The C-terminal amino acid sequence of the gingipain-1
(major form recognized in zymography SDS-PAGE, 0.1% gelatin in
gel) was found to be ELLR (SEQ ID NO:2). This corresponds to
the amino acids encoded at nucleotides 3094-3105 in SEQ ID
NO:3 and nucleotides 3094-3105 in SEQ ID NO:5, consistent with
autoproteolytic processing of the precursor polyprotein to
produce the mature 50 kDa RGP-1 protein. Without wishing to
be bound by theory, it is proposed that SEQ ID NO:3 comprises
the coding sequence for RGP-1, the enzymatically active
component of the high molecular weight form of Arg-gingipain.
This is consistent with the observation that there are at
least two genes with substantial nucleic acid homology to the
Arg-gingipain-specific probe.

Because progressive periodontitis is characterized by
tissue degradation, collagen destruction and a strong
inflammatory response, and because P. gingivalis exhibits
complement-hydrolyzing activity, purified RGP-1 was tested for
proteinase activity using purified human complement C3 and C5
as substrates [see Wingrove et al. (1992) J. Biol. Chem.
267:18902-18907]. RGP-1 selectively cleaved the C3 α-chain.
C3a biological activity in the C3 digestion mixture was not
observed, and the C3a-like fragment released from the α-chain
was extensively degraded by RGP-1. When human C5 is subjected
to prolonged digestion by RGP-1, functional C5a accumulates in
the digestion mixture. RGP-1 injected into guinea pig skin
enhances vascular permeability at concentrations greater than
10'^ M and causes neutrophil accumulation at the site of
injection. This activity was dependent on proteolytic
activity of the RGP-1 protein. The results demonstrate the
ability of RGP-1 to elicit an inflammatory response.

The N-terminal amino acid sequence of the 50 kDa
component of the HMW RGP is identical to the first 22 amino
acids of the 50 kDa RGP-1. Characterization of the HMW RGP
activity showed the same dependence on cysteine (or other
thiols) and the same spectrum of response to potential
inhibitors. Although the HMW RGP and RGP-1 amidolytic 
activity was stimulated by Gly-Gly, the response for RGP-2 was 
only about half that observed for RGP-1 and HMW RGP. 

The cloning and coding sequences for Arg-gingipain are
described in United States Patent No. 5,523,390. SEQ ID NO:3
herein is the DNA sequence of the 3159 bp PstI/BamHI fragment
from P. gingivalis strain HG66 (W83). An exemplified sequence
encoding mature RGP-1 extends from 1630-3105. The first
nucleotide belongs to the PstI cloning site. The first ATG
appears at nucleotide 949 and is followed by a long open
reading frame (ORF) of 2210 nucleotides. The first ATG is
followed by 8 others in frame (at nucleotides 1006, 1099,
1192, 1246, 1315, 1321, 1603, and 1609). Which of these
initiation codons is used in translation of the
Arg-gingipain-2 precursor can be determined by expression of the
polyprotein in bacteria and subsequent N-terminal sequence
analysis of preprotein intermediates. The primary structure
of the mature Arg-gingipain is derived from the empirical
N-terminal and C-terminal sequences and molecular mass. Thus, a
mature RGP has an amino terminus starting at nucleotide
residue 1630 in SEQ ID NO:3 and at amino acid 228 in SEQ ID
NO:4; both mature proteins are cleaved after an Arg. The 50
kDa and the 44 kDa bands from Bz-L-Arg-pNA activity peaks are
identical in sequence to the deduced amino acid sequence of
gingipain, encoded respectively at nucleotides 1630-1695 and
at nucleotides 3106-3156. The carboxyl terminus is most
likely derived from autoproteolytic processing after the Arg
residue encoded at 3103-3105 where the coding sequence of
hemagglutinin starts (nucleotide 3106). The deduced 492 amino
acids of RGP-1 give rise to a protease molecule with a
calculated molecular weight of 54 kDa, which correlates well
with the molecular mass of 50 kDa determined by SDS-PAGE
analysis.
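
The coordinates in the preceding paragraph can be cross-checked with a little arithmetic. The following is a minimal sketch, not the method used to obtain the calculated molecular weight in the text: the helper names are hypothetical, and the figure of 110 Da per residue is a common rule-of-thumb average assumed here for illustration (an exact mass would require the full amino acid sequence, which is not reproduced in this section).

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110.0

FIRST_ATG = 949
OTHER_ATGS = [1006, 1099, 1192, 1246, 1315, 1321, 1603, 1609]
MATURE_START, MATURE_END = 1630, 3105  # nucleotides encoding mature RGP-1


def inclusive_length(first: int, last: int) -> int:
    """Number of positions in the closed interval [first, last]."""
    return last - first + 1


def in_frame(position: int, reference: int) -> bool:
    """True if `position` lies a whole number of codons from `reference`."""
    return (position - reference) % 3 == 0


# Every listed ATG, and the mature start, should share the frame of the first ATG.
all_atgs_in_frame = all(in_frame(p, FIRST_ATG) for p in OTHER_ATGS)
mature_start_in_frame = in_frame(MATURE_START, FIRST_ATG)

mature_nucleotides = inclusive_length(MATURE_START, MATURE_END)
mature_residues = mature_nucleotides // 3
approx_mass_kda = mature_residues * AVERAGE_RESIDUE_MASS_DA / 1000

print(all_atgs_in_frame, mature_start_in_frame)   # True True
print(mature_nucleotides, mature_residues)        # 1476 492
print(round(approx_mass_kda, 1))                  # 54.1
```

Under these assumptions the 1476 nucleotides from 1630 to 3105 correspond to 492 codons, and the rough estimate of about 54 kDa agrees with the calculated value quoted in the text.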

The skilled artisan recognizes that other P. gingivalis
strains can have coding sequences for a protein with the
distinguishing characteristics of an Arg-gingipain; those
coding sequences may be identical to or synonymous with the
exemplified coding sequence, or there may be some variation(s)
in the encoded amino acid sequence. An Arg-gingipain coding
sequence from a P. gingivalis strain other than H66 can be
identified by, e.g., hybridization to a polynucleotide or an
oligonucleotide having the whole or a portion of the
exemplified coding sequence for mature gingipain, under
stringency conditions appropriate to detect a sequence of at
least 70% homology.

SEQ ID NO:5 presents the nucleotide sequence encoding the
complete prepolyprotein sequence, including both the protease
component and the hemagglutinin component(s) of HMW RGP. The
coding sequence extends from an ATG at nucleotide 949 through
a TAG stop codon ending at nucleotide 6063 in SEQ ID NO:5.
The deduced amino acid sequence is given in SEQ ID NO:6.
Cleavage of the precursor protein after the Arg residue at
amino acid 227 removes the N-terminal precursor portion, and
cleavage after the Arg residue at amino acid 719 releases a
low molecular weight Arg-gingipain
catalytic component and at least one hemagglutinin component.
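
The correspondence between the nucleotide coordinates of SEQ ID NO:5 and the residue numbering of SEQ ID NO:6 can be sketched as follows. This is an illustrative calculation only: it assumes that residue 1 is encoded by the ATG at nucleotide 949, and the helper `residue_to_nucleotides` and the segment labels are hypothetical names introduced here.

```python
CDS_START = 949   # first nucleotide of the initiating ATG in SEQ ID NO:5
CDS_END = 6063    # last nucleotide of the TAG stop codon


def residue_to_nucleotides(residue: int, cds_start: int = CDS_START) -> tuple[int, int]:
    """Return the first and last nucleotide of the codon for a 1-based residue,
    assuming residue 1 is encoded by the codon starting at `cds_start`."""
    first = cds_start + 3 * (residue - 1)
    return first, first + 2


codons = (CDS_END - CDS_START + 1) // 3  # includes the stop codon
protein_length = codons - 1              # the stop codon encodes no residue

# Segments implied by cleavage after Arg 227 and after Arg 719.
segments = {
    "N-terminal precursor portion": (1, 227),
    "catalytic component": (228, 719),
    "remaining hemagglutinin region": (720, protein_length),
}

print(protein_length)  # 1704
for name, (first, last) in segments.items():
    nt_first = residue_to_nucleotides(first)[0]
    nt_last = residue_to_nucleotides(last)[1]
    print(f"{name}: residues {first}-{last} "
          f"({last - first + 1} aa), nucleotides {nt_first}-{nt_last}")
```

With this numbering the catalytic component maps to nucleotides 1630-3105 and the following region begins at nucleotide 3106, in agreement with the coordinates given elsewhere in the description.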
The cloning and sequencing of the lysine-specific
gingipain (KGP) is described in United States Patent No.
5,475,077, which is incorporated by reference herein. The
coding sequence of the 60 kDa active component of the
Lys-gingipain complex extends through nucleotide 2863 in SEQ ID
NO:7. The amino acid sequence identical to the amino-terminal
sequence of the 44, 27 and 17 kDa Lys-gingipain complex
components, at least one of which is believed to function as a
hemagglutinin, is encoded at nucleotides 2864-2938 in SEQ ID
NO:7. Without wishing to be bound by any particular theory,
it is believed that an Arg-specific protease processes the
polyprotein which is (in part) encoded within the nucleotide
sequence of SEQ ID NO:7. The predicted molecular mass of 55.9
kDa for a 509 amino acid protein encoded from nucleotides
1336-2863 is consistent with the empirically determined
estimate of 60 kDa (SDS-PAGE).
Both HMW KGP (see U.S. Patent No. 5,475,077) and HMW RGP
can bind to erythrocytes, laminin and fibrinogen even if the
catalytic domains are inactivated. However, TLCK-inactivated
50 kDa RGP cannot bind although the active form can degrade
fibrinogen, fibronectin and laminin. Without wishing to be
bound by theory, it is postulated that three nearly identical
repeated sequences of HMW KGP and HMW RGP mediate this
adhesion. Polyclonal antibodies have been made in response to
a chemically synthesized peptide encompassing the repeated
sequence (YTYTVYRDGKIKEGLTATTEDDGVATG-NHEYCVEKYTAGSVSPKVC)
(SEQ ID NO:9), which is close to a consensus sequence for the
three repeating domains of HMW RGP and HMW KGP. These
antibodies do not affect the catalytic activities of these
proteases.

An Arg-gingipain coding sequence was also isolated from
P. gingivalis W50. A 3.5 kb BamHI fragment was sequenced; it
exhibited 99% nucleotide sequence identity with the 3159 bp
fragment of P. gingivalis W83 (HG66) DNA containing
Arg-gingipain coding sequence. A comparison of the deduced amino
acid sequences of the encoded Arg-gingipains revealed 99.9%
identity.

Regardless of the affinity for Arg-Sepharose and the 
differences in specific activities, the purified form of RGP-2 
gave in SDS-PAGE a single band with molecular mass of 48.5 
kDa, slightly lower than for the catalytic domain of HMW RGP 
(50.0 kDa). It is also slightly lower than for RGP-1, where
the molecular mass was refined using laser densitometry 
scanning of the gel to 49.0 kDa from the previously reported 
50 kDa. 

In contrast to the uniform molecular mass, analysis of
the purified forms of RGP-2 by means of zymography on gelatin
SDS-PAGE revealed reciprocal heterogeneity in active band
patterns and substantial differences in electrophoretic
mobility in comparison to RGP-1. The major activity zone of
the latter gingipain was located in the 68-70 kDa area of the
gel and had no equivalent either in the starting material
or in the activity peaks separated by gel filtration
chromatography. This indicates that the contribution of RGP-1
to the total proteolytic activity of P. gingivalis HG66 is
relatively minor, a conclusion which is in keeping with the
low activity against Bz-L-Arg-pNA recovered in the Vo of the
DE-52 column (300 activity units) as compared to the activity
eluted from the column with NaCl (5,819 activity units).
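The relative size of the two activity pools quoted above can be
expressed as a percentage. This is a minimal sketch, assuming that
the void-volume pool and the NaCl-eluted pool together account for
all of the recovered activity (the specification does not state the
total); the function name is illustrative.

```python
def activity_share(part_units, other_units):
    """Percentage of the combined activity found in the first pool."""
    total = part_units + other_units
    return 100.0 * part_units / total


void_volume_units = 300.0    # activity recovered in Vo of the DE-52 column
nacl_eluted_units = 5819.0   # activity eluted from the column with NaCl

share = activity_share(void_volume_units, nacl_eluted_units)
print(f"void-volume pool: {share:.1f}% of recovered activity")
```

On that assumption the void-volume pool is a small fraction of the
recovered activity, which is what the text means by a relatively
minor contribution.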

Partial primary structure analyses of the 48.5-50 kDa
forms of Arg-specific gingipain show that the amino-termini of
the three forms of RGP-2, which have been sequenced up to 50
amino acid residues, have primary structures identical to
those of RGP-1 and the catalytic domain of HMW RGP, with one
exception (Glu9 instead of Gln9). To further characterize
possible structural differences between the Arg-Sepharose
affinity variants of RGP-2 and RGP-1, a sample of each enzyme
was S-pyridylethylated and subjected to autodigestion or
trypsin digestion. Due to the RGPs' strict specificity for
Arg-X peptide bonds, autodigestion resulted in a discrete
peptide band pattern with relatively high molecular masses
within the range from 3 kDa to 27 kDa. The pattern was
identical for the affinity variants of RGP-2, but it showed
some differences in comparison to RGP-1, despite striking
similarities of the overall peptide maps.

The structure of the RGP-2 variants was further investigated
by reverse phase HPLC (C18 column) after tryptic digestion of
the S-pyridylethylated proteins. Exactly the same peptide
maps were again obtained, indicating that at the primary
structure level, the Arg-Sepharose affinity variants of RGP-2
are indistinguishable. In contrast, the peptide map of RGP-1
differs slightly from that of RGP-2. Several HPLC-purified
tryptic peptides derived from RGP-1 and RGP-2 have been
subjected to amino-terminal sequence analyses and, in both
cases, gave the same sequences, overlapping with the following
fragments of the catalytic domain of HMW RGP as inferred from
DNA structure: 61-Gln-80-Lys, 92-Ser-112-Arg, 142-Trp-184-Lys,
194-Asn-230-Lys. In one case, however, the peptide of RGP-2
which did not have an equivalent in the reverse phase HPLC
peptide map of RGP-1 gave a unique, though related, sequence
that differed from the latter one in 13 out of 29 compared
amino acid residues (Table 2). Although RGP-1 and RGP-2 are
closely related proteins, they differ in primary structure and
therefore must be the products of different genes.
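The comparison of 13 differing residues out of 29 can be restated as
a percent identity. The sketch below is illustrative only: it assumes
the two peptides are already aligned and of equal length, and the two
short sequences used to exercise the function are made-up toy strings,
not the peptides of Table 2.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity of two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def identity_from_counts(compared, differing):
    """Percent identity from the number of compared and differing positions."""
    return 100.0 * (compared - differing) / compared


# Figures from the text: 13 of 29 compared residues differ
rgp_peptide_identity = identity_from_counts(29, 13)

# Hypothetical toy sequences (not from the specification): 8 of 10 match
toy_identity = percent_identity("ACDEFGHIKL", "ACDEYGHIKV")

print(f"RGP-1 vs. RGP-2 peptide: {rgp_peptide_identity:.1f}% identity")
print(f"toy example: {toy_identity:.1f}% identity")
```

An identity of roughly 55% over a 29-residue stretch is far below
what sequencing error or strain variation would produce, which
supports the inference that two different genes are involved.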

SEQ ID NO: 3 and SEQ ID NO: 5 both represent sequences from
P. gingivalis. However, it is understood that there will be
some variations in the amino acid sequences and encoding
nucleic acid sequences for Arg-gingipains from different P.
gingivalis strains. The ordinary skilled artisan can readily
identify and isolate Arg-gingipain-encoding sequences from
other strains where there is at least 70% homology to the
specifically exemplified sequences herein using the sequences
provided herein taken with what is well known to the art,
e.g., polymerase chain reaction and/or nucleic acid
hybridization techniques. Also within the scope of the
present invention are Arg-gingipain where the protease (or
proteolytic component) has at least about 85% amino acid
sequence identity with an amino acid sequence exemplified
herein.

It is also understood by the skilled artisan that there
can be limited numbers of amino acid substitutions in a
protein without significantly affecting function, and that
nonexemplified gingipain-1 proteins can have some amino acid
sequence divergence from the exemplified amino acid sequence.
Such naturally occurring variants can be identified, e.g., by
hybridization to the exemplified (mature) RGP-1 or HMW RGP
coding sequence (or a portion thereof capable of specific
hybridization to Arg-gingipain sequences) under conditions
appropriate to detect at least about 70% nucleotide sequence
homology, preferably about 80%, more preferably about 90% and
most preferably 95-100% sequence homology. Preferably the
encoded Arg-gingipain protease or proteolytic component has at
least about 85% amino acid sequence identity to an exemplified
Arg-gingipain amino acid sequence.

It is well known in the biological arts that certain
amino acid substitutions can be made in protein sequences
without affecting the function of the protein. Generally,
conservative amino acid substitutions are tolerated without
affecting protein function. Similar amino acids can be those
that are similar in size and/or charge properties; for
example, aspartate and glutamate, and isoleucine and valine,
are both pairs of similar amino acids. Similarity between
amino acid pairs has been assessed in the art in a number of
ways. For example, Dayhoff et al. (1978) in Atlas of Protein
Sequence and Structure, Volume 5, Supplement 3, Chapter 22,
pages 345-352, which is incorporated by reference herein,
provides frequency tables for amino acid substitutions which
can be employed as a measure of amino acid similarity. Dayhoff
et al.'s frequency tables are based on comparisons of amino
acid sequences for proteins having the same function from a
variety of evolutionarily different sources.

In another embodiment of the present invention,
polyclonal and/or monoclonal antibodies capable of
specifically binding to a proteinase or fragments thereof are
provided. The term antibody is used to refer either to a
homogenous molecular entity or to a mixture such as a serum
product made up of a plurality of different molecular
entities. Monoclonal or polyclonal antibodies specifically
reacting with the Arg-gingipains can be made by methods known
in the art. See, e.g., Harlow and Lane (1988) Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratories; Goding
(1986) Monoclonal Antibodies: Principles and Practice, 2d ed.,
Academic Press, New York; and Ausubel et al. (1987) vide
infra. Also, recombinant immunoglobulins may be produced by
methods known in the art, including but not limited to, the
methods described in U.S. Patent No. 4,816,567. Monoclonal
antibodies with affinities of 10⁸ M⁻¹, preferably 10⁹ to 10¹⁰
or more, are preferred.

Antibodies specific for Arg-gingipains are useful, for
example, as probes for screening DNA expression libraries or
for detecting the presence of Arg-gingipains in a test sample.
Frequently, the polypeptides and antibodies will be labeled by
joining, either covalently or noncovalently, a substance which
provides a detectable signal. Suitable labels include but are
not limited to radionuclides, enzymes, substrates, cofactors,
inhibitors, fluorescent agents, chemiluminescent agents,
magnetic particles and the like. United States Patents
describing the use of such labels include, but are not limited
to, Nos. 3,817,837; 3,850,752; 3,939,350; 3,995,345;
4,277,437; 4,275,149; and 4,366,241.

Antibodies specific for Arg-gingipain(s) and capable of
inhibiting its proteinase activity are useful in treating
animals, including man, suffering from periodontal disease.
Such antibodies can be obtained by the methods described above
and subsequently screening the Arg-gingipain-specific
antibodies for their ability to inhibit proteinase activity.

Compositions and immunogenic preparations, including
vaccine compositions, comprising substantially purified
recombinant Arg-gingipain(s) or an immunogenic peptide of an
Arg-gingipain capable of inducing protective immunity in a
suitably treated mammal and a suitable carrier therefor are
provided. Alternatively, hydrophilic regions of the
proteolytic component or hemagglutinin component(s) of
Arg-gingipain can be identified by the skilled artisan, and
peptide antigens can be synthesized and conjugated to a
suitable carrier protein (e.g., bovine serum albumin or
keyhole limpet hemocyanin) if needed for use in vaccines or in
raising antibody specific for Arg-gingipains. Immunogenic
compositions are those which result in specific antibody
production when injected into a human or an animal. Such
immunogenic compositions or vaccines are useful, for example,
in immunizing an animal, including humans, against infection
and/or inflammatory response and tissue damage caused by P.
gingivalis in periodontal disease. The vaccine preparations
comprise an immunogenic amount of one or more Arg-gingipains
or an immunogenic fragment(s) or subunit(s) thereof. Such
vaccines can comprise one or more Arg-gingipains, alone or in
combination with another protein or other immunogen, or an
epitopic peptide derived therefrom. A preferred peptide has
an amino acid sequence identical to the N-terminal sequence of
RGP-1. An "immunogenic amount" means an amount capable of
eliciting the production of antibodies directed against
Arg-gingipain(s) in an individual to which the vaccine has been
administered.

Immunogenic carriers can be used to enhance the
immunogenicity of the proteinases, proteolytic components,
hemagglutinins or peptides derived in sequence from any of the
foregoing. Such carriers include but are not limited to
proteins and polysaccharides, liposomes, and bacterial cells
and membranes. Protein carriers may be joined to the
proteinases or peptides derived therefrom to form fusion
proteins by recombinant or synthetic means or by chemical
coupling. Useful carriers and means of coupling such carriers
to polypeptide antigens are known in the art.

The immunogenic compositions and/or vaccines may be
formulated by any of the means known in the art. They are
typically prepared as injectables, either as liquid solutions
or suspensions. Solid forms suitable for solution in, or
suspension in, liquid prior to injection may also be prepared.
The preparation may also, for example, be emulsified, or the
protein(s)/peptide(s) encapsulated in liposomes. Where
mucosal immunity is desired, the immunogenic compositions
advantageously contain an adjuvant such as the nontoxic
cholera toxin B subunit (see, e.g., United States Patent No.
5,462,734). Cholera toxin B subunit is commercially
available, for example, from Sigma Chemical Company, St.
Louis, MO. Other suitable adjuvants are available and may be
substituted therefor. It is preferred that an adjuvant for an
aerosol immunogenic (or vaccine) formulation is able to bind
to epithelial cells and stimulate mucosal immunity.

Among the adjuvants suitable for mucosal administration
and for stimulating mucosal immunity are organometallopolymers
including linear, branched or cross-linked silicones which are
bonded at the ends or along the length of the polymers to the
particle or its core. Such polysiloxanes can vary in
molecular weight from about 400 up to about 1,000,000 daltons;
the preferred length range is from about 700 to about 60,000
daltons. Suitable functionalized silicones include
(trialkoxysilyl)alkyl-terminated polydialkylsiloxanes and
trialkoxysilyl-terminated polydialkylsiloxanes, for example,
3-(triethoxysilyl)propyl-terminated polydimethylsiloxane. See
United States Patent No. 5,571,531, incorporated by reference
herein. Phosphazene polyelectrolytes can also be incorporated
into immunogenic compositions for transmucosal administration
(intranasal, vaginal, rectal, respiratory system by aerosol
administration) (see United States Patent No. 5,562,909).

Alternatively, mucosal immunity can be triggered by the
administration to mucosal surfaces, for example, orally, of
recombinant avirulent bacterial cells which express a
protective epitope derived from a P. gingivalis protease, for
example, RGP-1, HMW RGP or RGP-2. Of particular interest is
the expression of at least about 15 amino acids from the
N-terminus of RGP-2 or the N-terminus of a catalytic subunit
of HMW RGP or HMW KGP. Avirulent Salmonella typhi and
avirulent Salmonella typhimurium strains, suitable vectors and
suitable promoters for driving expression are known to the
art. The protective epitopes are advantageously expressed as
fusions with other proteins, such as Salmonella flagellin,
tetanus toxin fragment C, and E. coli LamB or MalE.

The active immunogenic ingredients are often mixed with
excipients or carriers which are pharmaceutically acceptable
and compatible with the active ingredient. Suitable
excipients include but are not limited to water, saline,
dextrose, glycerol, ethanol, or the like and combinations
thereof. The concentration of the immunogenic polypeptide in
injectable formulations is usually in the range of 0.2 to 5
mg/ml.

In addition, if desired, the vaccines may contain minor
amounts of auxiliary substances such as wetting or emulsifying
agents, pH buffering agents, and/or adjuvants which enhance
the effectiveness of the vaccine. Examples of adjuvants which
are effective include, but are not limited to, aluminum
hydroxide; N-acetyl-muramyl-L-threonyl-D-isoglutamine
(thr-MDP); N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP
11637, referred to as nor-MDP); N-acetylmuramyl-L-alanyl-D-
isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-
3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as
MTP-PE); and RIBI, which contains three components extracted
from bacteria, monophosphoryl lipid A, trehalose dimycolate
and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80
emulsion. The effectiveness of an adjuvant may be determined
by measuring the amount of antibodies directed against the
immunogen resulting from administration of the immunogen in
vaccines which are also comprised of the various adjuvants.
Such additional formulations and modes of administration as
are known in the art may also be used.

RGP-1 and/or RGP-2 or HMW RGP and/or epitopic fragments
or peptides of sequences derived therefrom or from other P.
gingivalis proteins having primary structure similar (more
than 90% identity) to HMW RGP or HMW KGP may be formulated
into vaccines as neutral or salt forms. Pharmaceutically
acceptable salts include, but are not limited to, the acid
addition salts (formed with free amino groups of the peptide)
which are formed with inorganic acids, e.g., hydrochloric acid
or phosphoric acids; and organic acids, e.g., acetic, oxalic,
tartaric, or maleic acid. Salts formed with the free carboxyl
groups may also be derived from inorganic bases, e.g., sodium,
potassium, ammonium, calcium, or ferric hydroxides, and
organic bases, e.g., isopropylamine, trimethylamine,
2-ethylamino-ethanol, histidine, and procaine.

The immunogenic compositions or vaccines are administered
in a manner compatible with the dosage formulation, and in
such amount as is prophylactically and/or therapeutically
effective. The quantity to be administered, generally in the
range of about 100 to 1,000 µg of protein per dose, more
generally in the range of about 5 to 500 µg of protein per
dose, depends on the subject to be treated, the capacity of
the individual's immune system to synthesize antibodies, and
the degree of protection desired. Precise amounts of the
immunogen may depend on the judgment of the physician or
dentist and may be peculiar to each individual, but such a
determination is within the skill of such a practitioner.

WO 97/34629    PCT/US97/04635

The vaccine or other immunogenic composition can be given
in a single dose or multiple dose schedule. A multiple dose
schedule is one in which a primary course of vaccination may
include 1 to 10 or more separate doses, followed by other
doses administered at subsequent time intervals as required to
maintain and/or reinforce the immune response, e.g., at 1 to 4
months for a second dose, and if needed, a subsequent dose(s)
after several months.

When mice were immunized (see Example 8) and subsequently
challenged with live P. gingivalis in the subcutaneous (SC)
chamber model for growth and invasion of P. gingivalis, there
was significant protection against infection where the
experimental animals were immunized with heat-killed whole
cells of P. gingivalis, RGP-2, HMW RGP, and peptides derived
from the catalytic domain or N-terminus of a 50 kDa
Arg-gingipain or an adhesin domain of HMW RGP, with infection
being measured by recovery of viable P. gingivalis from the SC
chambers (see Example 8, Table 4).

All control (unimmunized) mice yielded viable bacteria
during the course of infection. When mice were immunized with
heat-killed P. gingivalis A7436 whole cells, HMW RGP, RGP-2 or
Peptide A (N-terminal sequence of catalytic subunit of HMW
RGP, SEQ ID NO: 10), no viable bacteria were recovered at day
7. Partial protection was afforded by Peptide B, the
catalytic domain peptide (SEQ ID NO: 11) and by Peptide C, the
hemagglutinin domain of HMW RGP (SEQ ID NO: 12).

When protection was assessed by the survival or absence
of lesions in the SC chamber model, Peptide B gave partial
protection while the remaining treatments gave full protection
(see Table 5 in Example 8).

Humans (or other mammals) immunized with Arg-gingipains
or Lys-gingipains and/or peptides having amino acid sequences
derived from a low molecular weight Arg-gingipain or a HMW
RGP, are protected from infection and invasion by P.
gingivalis as assessed in this animal model. Preferably the
hemagglutinin domain is not contained in the immunogenic
composition.

Female Balb/c mice were immunized with either RGP-1,
RGP-2, or MAP-conjugated RGP-derived peptides by direct
injection into stainless steel chambers implanted
subcutaneously (Example 8), and subsequently challenged by
injection of live P. gingivalis into the chambers.
Non-immunized animals or animals immunized with a scrambled
peptide control and challenged with P. gingivalis developed
ulcerated necrotic lesions on their abdomens, exhibited severe
cachexia with ruffled hair, hunched bodies, and weight loss,
with 14/22 and 5/8 deaths, respectively (Table 7).
In contrast, animals immunized with MAP-conjugated Peptide A,
corresponding to the N-terminus of the catalytic domain of
RGPs (Fig. 4), followed by challenge with P. gingivalis were
completely protected from abscess formation and death (Table
7). Similar results were obtained in animals that had been
immunized with either whole P. gingivalis cells, RGP-1, or
RGP-2. However, immunization with peptides corresponding to
either a sequence encompassing the catalytic cysteine residue
of RGPs (Peptide B) or an homologous sequence within the
catalytic domains of RGPs and KGP (Peptide C), followed by
challenge with P. gingivalis, did not protect animals, nor did
a peptide corresponding to the binding site within the
adhesin/hemagglutinin domain of RGP-1 (Peptide D; Fig. 4,
Table 1, SEQ ID NO: 14), which has been shown to be directly
involved in the hemagglutinin activity of this gingipain
[Curtiss et al. (1996) Infect. Immun. 64:2532]. Immunization
with either Peptide A, RGP-1, RGP-2, or P. gingivalis whole
cells, followed by challenge with live bacteria, resulted in a
decrease in the number of mice from which this organism could
be cultured (Table 8). In contrast, P. gingivalis was readily
cultured from chamber fluid obtained from 20/22 non-immunized
mice up to the time of death (Table 8) and from animals
challenged after immunization with Peptides B, C, and D. In
non-immunized animals P. gingivalis levels increased relative
to the initial inoculum (10® to 10^^ CFU) throughout the course
of the experiments (Table 3), while in animals immunized with
Peptide A, RGP-1, RGP-2, or whole cells, P. gingivalis
decreased in numbers (from 10* to <10*). Taken together, these
results indicate that immunization with a peptide
corresponding to the N-terminal catalytic domain of RGPs can
limit the ability of P. gingivalis to colonize and invade with
the same efficiency as immunization with active proteinases or
whole bacteria.

Immunization with the N-terminal peptide of Arg-gingipain
induced a moderate IgG response to RGP-1 and RGP-2 (Table 9).
The absence of a response to whole cells may be due to the
lack of exposure of this epitope on cell surfaces so that the
N-terminus of the membrane-associated RGP-1 catalytic domain
is not available for antibody binding. The IgG response
obtained following immunization with Peptide D, representing a
portion of the adhesin/hemagglutinin domain of RGP-1, was
comparable to that induced by the N-terminal peptide; however,
protection against P. gingivalis challenge was not observed
when this peptide was used as an immunogen (Tables 1 and 2).
Immunization with RGP-1 induced a high IgG titer to all
antigens examined except for RGP-2 (Table 9). The low titer to
RGP-2 may be due to the absence of the highly immunogenic
adhesin/hemagglutinin domain in this enzyme [Okamoto et al.
(1996) J. Biochem. 120:398; Barkocy-Gallagher et al. (1996) J.
Bacteriol. 178:2734]. Immunization with whole cells induced a
good response to RGP-1 and KGP with essentially no binding to
RGP-2. Postchallenge serum IgG titers were higher for all
immunization groups when compared to the chamber fluid IgG
titer 3 weeks post immunization, reflecting the effect of
challenge with P. gingivalis.

Competitive ELISA assays, using either RGP-1 or KGP as
competing soluble antigens, indicated that 42% and 53% of the
antibodies induced by immunization with heat-killed bacteria
recognize RGP-1 and KGP, respectively (Fig. 5). However, even
at very high concentrations, RGP-2 did not hinder IgG binding
to P. gingivalis. These observations were also confirmed by
Western blot analysis (Fig. 6D) and indicate that the
non-catalytic hemagglutinin domains of RGP-1 and KGP are
responsible for approximately 50% of the induced IgG response,
and as such, constitute major antigens of P. gingivalis.
Chamber fluid from mice immunized with the N-terminal peptide
of the catalytic domain of RGPs reacted with the 50 kDa RGP-1,
the catalytic domain of HMW RGP, with HMW RGP, with RGPs
present in vesicles and bacterial membrane fractions, and with
RGP-2 (Fig. 6A). A similar pattern was observed when chamber
fluid from animals immunized with whole RGP-2 was utilized
(Fig. 6E). The lack of reactivity with KGP is in agreement
with antibody-specificity results (Table 2). Although the
adhesin domain-derived peptide induced a poor IgG response as
detected by ELISA, we found reactivity to several proteins by
Western blot analysis (Fig. 6C). RGP-2 was not recognized by
this antibody due to the lack of an adhesin domain. However,
reactivity could be detected with the 27 kDa domains of RGP-1
and KGP and proteins migrating in the range of 60-70 kDa in
vesicle and membrane preparations. Significantly, the adhesin
domains present in the 44 kDa and 17 kDa subunits (Fig. 4) did
not bind antibody.

Immunization with RGP-1 resulted in antibodies with
specificity predominantly directed against the 44 kDa
adhesin/hemagglutinin domain of RGP-1 and the 43 kDa domain of
KGP (Fig. 6B). These domains were also recognized in vesicle
and membrane preparations. Additional protein bands
recognized by this antiserum included the 32 and 17 kDa
proteins in KGP, as well as the equivalents in vesicles and
membranes. However, the RGP-1 catalytic domain was only
weakly recognized, and RGP-2 not at all. These results are in
agreement with previous studies in which the catalytic domains
of RGPs were poorly recognized in antisera obtained from
rabbits or chickens immunized with the entire RGP-1 molecule.
Immunization with heat-killed bacteria results in antibodies
(Fig. 6D) with specificities astonishingly similar to those
induced by immunization with RGP-1. In addition to
polypeptides composing the RGP-1 complex, high molecular
weight proteins were also detected in vesicles and membranes.
No reactivity was detected (Western blot analysis) for the
catalytic domain of RGP-1 or RGP-2, results in agreement with
those obtained with mice immunized with RGP-1 (Fig. 6B) and
consistent with data obtained by ELISA in which antibodies
generated following immunization with heat-killed P.
gingivalis exhibited a very low titer against RGP-2.

This study indicates that in mice the major IgG response
is targeted to the adhesin/hemagglutinin domain of RGP-1.
This is consistent with analysis of sera from patients with
severe, untreated periodontitis. Such a specific response to
the adhesin/hemagglutinin domain of gingipains mounted in
human periodontitis patients appears to divert the immune
response away from other protective antigens. In the mouse
model, antibodies with this specificity can limit colonization
and invasion of P. gingivalis. However, in human subjects
where the local inflammatory response leads to bone loss and
destruction of the periodontal ligament, such antibodies can
aggravate local tissue damage within the periodontal ligament.
In this study, immunization of mice with a peptide
corresponding to the N-terminus of RGPs generated a protective
antibody response, but those antibodies did not recognize
either RGP-1 or RGP-2 in cell preparations, indicating that
this epitope (Fig. 4) is not exposed in whole cells. Rabbit
antisera generated to the N-terminal portion of the catalytic
domain of RGP-1 and RGP-2 also did not recognize RGP-1 in
membranes or vesicle preparations unless samples were
denatured by boiling, again suggesting that this epitope is
not exposed in whole cells or vesicles. Inhibition of the
maturation and/or catalytic activity of RGPs can inhibit
invasion and colonization of P. gingivalis in mice and man.
Such enzymes contribute to virulence in a multifactorial
manner by influencing adherence to host tissues, activating
cascade systems, degrading host proteins, and disturbing host
defenses. RGPs can act as processing proteinases responsible
for self maturation and the maturation of KGP, fimbrillin, and
a 75 kDa major cell surface protein. These latter proteins
are required for full virulence of P. gingivalis [Malek et al.
(1994) J. Bacteriol. 176:1052; Goulbourne and Ellen (1991) J.
Bacteriol. 173:5266; Lamont et al. (1994) Oral Microbiol.
Immunol. 9:272; Lamont et al. (1992) Oral Microbiol. Immunol.
7:193; Hamada et al. (1994) Infect. Immun. 62:1696; Tokuda et
al. (1996) Infect. Immun. 64:4067].

Except as noted hereafter, standard techniques for
peptide synthesis, cloning, DNA isolation, amplification and
purification, for enzymatic reactions involving DNA ligase,
DNA polymerase, restriction endonucleases and the like, and
various separation techniques are those known and commonly
employed by those skilled in the art. A number of standard
techniques are described in Ausubel et al. (1994) Current
Protocols in Molecular Biology, Green Publishing, Inc.;
Sambrook et al. (1989) Molecular Cloning, Second Edition, Cold
Spring Harbor Laboratory, Plainview, New York; Maniatis et al.
(1982) Molecular Cloning, Cold Spring Harbor Laboratory,
Plainview, New York; Wu (ed.) (1993) Meth. Enzymol. 218, Part
I; Wu (ed.) (1979) Meth. Enzymol. 68; Wu et al. (eds.) (1983)
Meth. Enzymol. 100 and 101; Grossman and Moldave (eds.) Meth.
Enzymol.; Miller (ed.) (1972) Experiments in Molecular
Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor,
New York; Old and Primrose (1981) Principles of Gene
Manipulation, University of California Press, Berkeley;
Schleif and Wensink (1982) Practical Methods in Molecular
Biology; Glover (ed.) (1985) DNA Cloning Vol. I and II, IRL
Press, Oxford, UK; Hames and Higgins (eds.) (1985) Nucleic
Acid Hybridization, IRL Press, Oxford, UK; Setlow and
Hollaender (1979) Genetic Engineering: Principles and Methods,
Vols. 1-4, Plenum Press, New York. Abbreviations and
nomenclature, where employed, are deemed standard in the field
and commonly used in professional journals such as those cited
herein. All references cited in
this application are incorporated by reference in their
entirety.

The foregoing discussion and the following examples
illustrate but are not intended to limit the invention. The 
skilled artisan will understand that alternative methods can 
be used to implement the invention. 

Example 1. Purification of Arg-Gingipains and Lys-Gingipains
Bacterial Cultivation

P. gingivalis strains HG66 (W83) and W50 (virulent) were
used in these studies. Cells were grown in 500 ml of broth
containing 15.0 g Trypticase Soy Broth (Difco, Detroit,
Michigan), 2.5 g yeast extract, 2.5 mg hemin, 0.25 g cysteine,
0.05 g dithiothreitol, 0.5 mg menadione (all from Sigma
Chemical Company, St. Louis, MO) anaerobically at 37°C for 48
hr in an atmosphere of 85% N₂, 10% CO₂, 5% H₂. The entire 500
ml culture was used to inoculate 20 liters of the same medium,
and the latter was incubated in a fermentation tank at 27°C
for 48 hr (to a final optical density of 1.8 at 650 nm).
RGP-1 can also be purified as described for RGP-2.
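The 20 liter fermentation uses "the same medium" as the 500 ml
starter culture, so the component amounts scale with volume. The
following sketch works that scaling out; it assumes the medium is
scaled linearly by volume (the specification does not list the 20
liter amounts), and the dictionary and function names are
illustrative only.

```python
# Amounts per 500 ml of broth, in milligrams, as listed in the text
STARTER_VOLUME_ML = 500
STARTER_RECIPE_MG = {
    "Trypticase Soy Broth": 15000.0,
    "yeast extract": 2500.0,
    "hemin": 2.5,
    "cysteine": 250.0,
    "dithiothreitol": 50.0,
    "menadione": 0.5,
}


def scale_recipe(recipe_mg, from_volume_ml, to_volume_ml):
    """Scale each component linearly with culture volume."""
    factor = to_volume_ml / from_volume_ml
    return {name: amount * factor for name, amount in recipe_mg.items()}


fermenter_recipe_mg = scale_recipe(STARTER_RECIPE_MG, STARTER_VOLUME_ML, 20000)

for name, amount_mg in fermenter_recipe_mg.items():
    print(f"{name}: {amount_mg / 1000.0:g} g")
```

The scale factor from 500 ml to 20 liters is 40, so, for example, the
15.0 g of Trypticase Soy Broth becomes 600 g under this assumption.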

Proteinase Purification (RGP-l) 

1200 ml of cell-free supernatant was obtained from the 48 hr
culture by centrifugation at 18,000 x g for 30 min at 4°C.
Proteins in the supernatant were precipitated by 90%
saturation with ammonium sulfate. After 2 hr at 4°C, the
suspension was centrifuged at 18,000 x g for 30 min. The
resulting pellet was dissolved in 0.05 M sodium acetate
buffer, pH 4.5, 0.15 M NaCl, 5 mM CaCl2; the solution was
dialyzed against the same buffer overnight at 4°C, with three
changes at a buffer:protein solution ratio greater than 150:1.
The dialysate was then centrifuged at 25,000 x g for 30 min and
the dark brown supernatant (26 ml) was chromatographed
over an agarose gel filtration column (5.0 x 150 cm; Sephadex
G-150, Pharmacia, Piscataway, NJ) which had been
pre-equilibrated with the same buffer. The column was developed
with this buffer at a flow rate of 36 ml/hr. Fractions of 6 ml
were collected and assayed for both amidolytic and proteolytic
activities, using Bz-L-Arg-pNA and azocasein as substrates.
Four peaks containing amidolytic activity were identified.
The fractions corresponding to peak 4 were combined,
concentrated by ultrafiltration (Amicon PM-10 membrane;
Amicon, Beverly, MA) and then dialyzed overnight against 0.05 M
Bis-Tris, 5 mM CaCl2, pH 6.0. The volume of the dialysate was
14 ml.

The 14 ml dialysate from the previous step was then
applied to a DEAE-cellulose (Whatman, Maidstone, England)
column (1 x 10 cm) equilibrated with 0.05 M Bis-Tris, 5 mM
CaCl2, pH 6.0. The column was then washed with an additional
100 ml of the same buffer. About 75% of the amidolytic
activity, but only about 50% of the protein, passed through
the column. The column wash fluid was dialyzed against 0.05 M
sodium acetate buffer containing 5 mM CaCl2 (pH 4.5). This 19
ml dialysate was applied to a Mono S FPLC column (Pharmacia
LKB Biotechnology Inc., Piscataway, NJ) equilibrated with the
same buffer. The column was washed with the starting buffer
at a flow rate of 1.0 ml/min for 20 min. Bound proteins were
eluted first with a linear NaCl gradient (0 to 0.1 M) followed
by a second linear NaCl gradient (0.1 to 0.25 M), each
gradient applied over a 25 min time period. Fractions were
assayed for amidolytic activity using Bz-L-Arg-pNA. Fractions
with activity were pooled and re-chromatographed using the
same conditions. Although not detectable by gel
electrophoresis, trace contamination by a proteinase capable
of cleaving after lysyl residues was sometimes observed. This
contaminating activity was readily removed by applying the
sample to an arginine-agarose affinity column
(L-Arginine-SEPHAROSE 4B) equilibrated with 0.025 M Tris-HCl,
5 mM CaCl2, 0.15 M NaCl, pH 7.5. After washing with the same
buffer, purified enzyme was eluted with 0.05 M sodium acetate
buffer, 5 mM CaCl2, pH 4.5. Yields of gingipain-1 were markedly
reduced by this step (about 60%).

RGP-1 can also be purified as described for RGP-2 with
such appropriate modifications as are readily apparent to one
of ordinary skill in the art.

Proteinase Purification (HMW RGP)

The culture supernatant (2,900 ml) was obtained by
centrifugation of the whole culture (6,000 x g, 30 min, 4°C).
Chilled acetone (4,350 ml) was added to this fraction over a
period of 15 min, with the temperature of the solution
maintained below 0°C at all times using an ice/salt bath, and
this mixture was centrifuged (6,000 x g, 30 min, -15°C). The
precipitate was dissolved in 290 ml of 20 mM Bis-Tris-HCl, 150
mM NaCl, 5 mM CaCl2, 0.02% (w/v) NaN3, pH 6.8 (Buffer A), and
dialyzed against Buffer A containing 1.5 mM
4,4'-dithiodipyridine disulfide for 4 h, followed by 2 changes of
Buffer A overnight. The dialyzed fraction was centrifuged
(27,000 x g, 30 min, 4°C), following which it was concentrated
to 40 ml by ultrafiltration using an Amicon PM-10 membrane.
This concentrated fraction was applied to a Sephadex G-150
column (5 x 115 cm = 2260 ml; Pharmacia, Piscataway, NJ) which
had previously been equilibrated with Buffer A, and the
fractionation was carried out at 30 ml/h (1.5 cm/h).
Fractions (9 ml) were assayed for activity against
Bz-L-Arg-pNA and Z-L-Lys-pNA (Novabiochem; 0.5 mM). Amidolytic
activities for Bz-L-Arg-pNA (0.5 mM) or Z-L-Lys-pNA were
measured in 0.2 M Tris-HCl, 1 mM CaCl2, 0.02% (w/v) NaN3, 10 mM
L-cysteine, pH 7.6. General proteolytic activity was measured
with azocasein (2% w/v) as described by Barrett and Kirschke
(1981) Meth. Enzymol. 80:535-561 for cathepsin L. Three peaks
with activity against the two substrates were found. The
first (highest molecular weight) peak of activity was pooled,
concentrated to 60 ml using ultrafiltration and dialyzed
overnight against two changes of 50 mM Tris-HCl, 1 mM CaCl2,
0.02% NaN3, pH 7.4 (Buffer B).
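
As a rough check on the acetone precipitation step above, the sketch
below computes the acetone-to-supernatant ratio and the nominal final
acetone fraction from the two volumes quoted in the text. It assumes
the volumes are simply additive, which is an approximation for
acetone/water mixtures; the function name is hypothetical.

```python
def acetone_fraction(supernatant_ml, acetone_ml):
    """Nominal v/v acetone fraction, assuming additive volumes (approximation)."""
    return acetone_ml / (supernatant_ml + acetone_ml)


if __name__ == "__main__":
    supernatant_ml = 2900.0  # culture supernatant volume quoted in the text
    acetone_ml = 4350.0      # chilled acetone volume quoted in the text
    print(f"acetone : supernatant = {acetone_ml / supernatant_ml:.2f} : 1")
    print(f"nominal acetone fraction = {acetone_fraction(supernatant_ml, acetone_ml):.0%}")
```

On these figures the acetone is added at 1.5 volumes per volume of
supernatant.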

This high MW fraction was applied to an
L-Arginine-Sepharose column (1.5 x 30 cm = 50 ml), which had
previously been equilibrated with Buffer B at a flow rate of
20 ml/hr (11.3 cm/h), following which the column was washed
with two column volumes of Buffer B. Following this, a step
gradient of 500 mM NaCl was applied in Buffer B and the column
was washed with this concentration of NaCl until the A280
baseline fell to zero. After re-equilibration of the column in
Buffer B, a gradient from 0-750 mM L-lysine was applied in a
total volume of 300 ml, followed by 100 ml of 750 mM L-lysine.
The column was once again re-equilibrated with Buffer B and a
further gradient to 100 mM L-arginine in 300 ml was applied in
the same way. Fractions (6 ml) from the Arg wash were assayed
for activity against the two substrates as described
previously. The arginine gradient eluted a major peak for an
enzyme degrading Bz-L-Arg-pNA. The active fractions were
pooled and dialyzed against two changes of 20 mM Bis-Tris-HCl,
1 mM CaCl2, 0.02% (w/v) NaN3, pH 6.4 (Buffer C) and
concentrated down to 10 ml using an Amicon PM-10 membrane.
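
The column figures quoted above (bed volume and linear flow rate) can
be reproduced from the stated dimensions. The sketch below assumes the
first dimension of each column is its internal diameter and the second
its bed height, which is the usual convention but is not stated
explicitly in the text; the function names are hypothetical.

```python
import math


def cross_section_cm2(diameter_cm):
    """Cross-sectional area of a cylindrical column."""
    return math.pi * (diameter_cm / 2.0) ** 2


def bed_volume_ml(diameter_cm, height_cm):
    """Bed volume in ml (1 cm^3 = 1 ml)."""
    return cross_section_cm2(diameter_cm) * height_cm


def linear_flow_cm_per_h(flow_ml_per_h, diameter_cm):
    """Linear flow rate = volumetric flow / cross-sectional area."""
    return flow_ml_per_h / cross_section_cm2(diameter_cm)


if __name__ == "__main__":
    # Sephadex G-150 column: 5 x 115 cm run at 30 ml/h
    print(round(bed_volume_ml(5.0, 115.0)), "ml,",
          round(linear_flow_cm_per_h(30.0, 5.0), 1), "cm/h")
    # L-Arginine-Sepharose column: 1.5 x 30 cm run at 20 ml/h
    print(round(bed_volume_ml(1.5, 30.0)), "ml,",
          round(linear_flow_cm_per_h(20.0, 1.5), 1), "cm/h")
```

The computed values agree with the quoted 2260 ml and 1.5 cm/h, and
with the quoted 11.3 cm/h; the quoted 50 ml for the arginine column
appears to be a rounded figure.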

The concentrate with activity for cleaving Bz-L-Arg-pNA
was applied to a Mono Q FPLC column (Pharmacia LKB
Biotechnology Inc., Piscataway, NJ) equilibrated in Buffer C.
The column was washed with 5 column volumes of Buffer C at 1.0
ml/min, following which bound protein was eluted with a 3 step
gradient [0-200 mM NaCl (10 min), followed by 200-250 mM NaCl
(15 min) and 250-500 mM NaCl (5 min)]. The active fractions
from Mono Q were pooled and used for further analyses.

RGP-2 Purification 

Cells of P. gingivalis (H66) were grown in 200 ml of
broth containing 6.0 g of Trypticase Soy broth (Difco), 2.0 g
of yeast extract, 1 mg of hemin, 200 mg of cysteine, 20 mg of
dithiothreitol and 0.5 mg of menadione (all from Sigma
Chemical Co., St. Louis, MO) anaerobically at 37°C for 48 h
in an atmosphere of 85% N2, 10% CO2, 5% H2. The culture was
used to inoculate 5 liters of the same broth, and incubated
anaerobically at 37°C for about 48-60 h until the late
stationary phase of bacterial growth (final optical density
>2.0).

For purification of RGP-2, the initial steps of
purification were performed according to the method designed for
94 kDa HMW RGP and high molecular weight lysine-specific
gingipain (KGP) purification [Pike et al. (1994) J. Biol.
Chem. 269:406-411]. Briefly, the cell-free culture fluid was
obtained by centrifugation of the whole culture and chilled to
-20°C. Acetone was slowly added to the chilled culture
supernatant, with the temperature being maintained below 0°C.
The precipitated protein was collected by centrifugation, and
the pellet was dissolved in 20 mM Bis-Tris, 150 mM NaCl, 0.02%
NaN3 buffer (pH 6.8) containing 1.5 mM 4,4'-dithiodipyridine
disulfide (in a total volume equal to 1/20 of the original
culture supernatant subjected to precipitation) and dialyzed
first against the above buffer (one change) followed by two
changes of the Bis-Tris/NaCl buffer supplemented with 5 mM
CaCl2 but lacking 4,4'-dithiodipyridine disulfide. The dialyzed
protein solution was clarified by high speed centrifugation
(40,000 x g, 2 h), concentrated by ultrafiltration using an
Amicon PM-10 membrane (Amicon, Danvers, MA), and the clarified
solution was then applied to a gel filtration column (Sephadex
G-150, Pharmacia, Piscataway, NJ) equilibrated with Bis-Tris
buffer. The column was developed at a flow rate of 30 ml/h, and
three peaks with activity against Bz-L-Arg-pNA and Z-L-Lys-pNA
were found. The highest molecular mass peak of activity against
Bz-L-Arg-pNA/Z-L-Lys-pNA was used for the purification of 95
kDa HMW RGP exactly as described by Pike et al. (1994) supra,
while the lowest molecular mass peak, having the majority of
the activity against Bz-L-Arg-pNA, was pooled, concentrated by
ultrafiltration, and extensively dialyzed against several
changes of 50 mM Bis-Tris, 1 mM CaCl2, pH 6.5, and loaded at a
flow rate of 20 ml/h on an anion exchange DE-52 cellulose
(Whatman) column (1.5 x 20 cm) equilibrated with
Bis-Tris/CaCl2 buffer. This column was washed until the A280nm
baseline fell to zero; then a gradient of 0-200 mM NaCl was
applied in a total volume of 250 ml. Fractions (4 ml each)
were assayed for activity against Bz-L-Arg-pNA. Some of this
activity was found in the void volume (Vo) of the column, but
the major peak was eluted at 100 mM NaCl concentration.
Fractions from both peaks of activity were pooled,
concentrated and dialyzed extensively either against 50 mM
sodium acetate buffer, 5 mM CaCl2, pH 4.5 (Vo) or against 50
mM Tris, 1 mM CaCl2, pH 7.4 with 0.02% NaN3 (NaCl eluate).

From the Vo (run-through) of the DE-52 column, RGP-1 was
purified by means of HPLC on a Mono S column, followed by
affinity chromatography over arginine-Sepharose 4B as
described previously [Chen et al. (1991) supra]. The major
activity peak eluted from the DE-52 cellulose column with NaCl
was applied to the arginine-Sepharose column (1.5 x 30 cm, 50
ml) equilibrated with Tris/CaCl2 buffer, pH 7.4, at a flow rate
of 20 ml/h, following which the column was washed with buffer
until activity against Bz-L-Arg-pNA fell below 20 mOD/min/ml;
then a gradient to 100 mM L-arginine was applied in a volume
of 300 ml. Three distinct peaks of activity obtained in this
step (nonadsorbed, retarded, and eluted with L-arginine) were
concentrated, dialyzed against 3 changes of 50 mM sodium
acetate buffer, 1 mM CaCl2, pH 4.5, and applied to a Mono S FPLC
column equilibrated with the same buffer at a flow rate of 1
ml/min. The column was washed with starting buffer and bound
protein eluted using a linear NaCl gradient (0-0.15 M NaCl
over a 30 min time period). Fractions in peaks containing
activity were combined, dialyzed against 20 mM Bis-Tris, 150
mM NaCl, 5 mM CaCl2, pH 6.8, with NaN3 and used for further
analysis.

Purification of Lys-Gingipain 

P. gingivalis strain HG66 (W83) was obtained from Roland
Arnold (Emory University, Atlanta, GA). Cells were grown in
500 ml of broth containing 15.0 g Trypticase Soy Broth (Difco,
Detroit, Michigan), 2.5 g yeast extract, 2.5 mg hemin, 0.25 g
cysteine, 0.05 g dithiothreitol and 0.5 mg menadione (all from
Sigma Chemical Company, St. Louis, MO) anaerobically at 37°C
for 48 hr in an atmosphere of 85% N2, 10% CO2, 5% H2. The
entire 500 ml culture was used to inoculate 20 liters of the
same medium, and the latter was incubated in a fermentation
tank at 37°C for 48 hr (to a final optical density of 1.8 at
650 nm).

The culture supernatant (2,900 ml) was obtained by
centrifugation of the whole culture (6,000 x g, 30 min, 4°C).
Chilled acetone (4,350 ml) was added to this fraction over a
period of 15 min, with the temperature of the solution
maintained below 0°C at all times using an ice/salt bath, to
precipitate proteins. This mixture was centrifuged (6,000 x
g, 30 min, -15°C). The precipitate was dissolved in 290 ml of
20 mM Bis-Tris-HCl, 150 mM NaCl, 5 mM CaCl2, 0.02% (w/v) NaN3,
pH 6.8 (Buffer A), and dialyzed against Buffer A containing
1.5 mM 4,4'-dithiodipyridine disulfide for 4 h, followed by 2
changes of Buffer A overnight. The dialyzed fraction was
centrifuged (27,000 x g, 30 min, 4°C), following which the
supernatant was concentrated to 40 ml by ultrafiltration using
an Amicon PM-10 membrane. This concentrated fraction was
applied to a Sephadex G-150 column (5 x 115 cm = 2260 ml;
Pharmacia, Piscataway, NJ) which had previously been
equilibrated with Buffer A, and the fractionation was carried
out at 30 ml/h (1.5 cm/h). Fractions (9 ml) were assayed for
activity against Bz-L-Arg-pNA and Z-L-Lys-pNA (Novabiochem;
0.5 mM). Amidolytic activities for Bz-L-Arg-pNA (0.5 mM) or
Z-L-Lys-pNA were measured in 0.2 M Tris-HCl, 1 mM CaCl2, 0.02%
(w/v) NaN3, 10 mM L-cysteine, pH 7.6. Three peaks with
activity against both pNA substrates were found. The highest
molecular weight peak of activity contained most of the
Z-L-Lys-pNA amidolytic activity. The fractions of the highest
molecular weight peak of activity were pooled, concentrated to
60 ml using ultrafiltration and dialyzed overnight against two
changes of 50 mM Tris-HCl, 1 mM CaCl2, 0.02% NaN3, pH 7.4
(Buffer B).

This high MW fraction concentrate was applied to an
L-Arginine-Sepharose column (1.5 x 30 cm = 50 ml), which had
previously been equilibrated with Buffer B at a flow rate of
20 ml/hr (11.3 cm/h), following which the column was washed
with two column volumes of Buffer B. Following this, a step
gradient of 500 mM NaCl was applied in Buffer B and the column
was washed with this concentration of NaCl until the A280
baseline fell to zero. After re-equilibration of the column
with Buffer B, a linear gradient from 0-750 mM L-lysine in
Buffer B was applied in a total volume of 300 ml, followed by
100 ml of Buffer B containing 750 mM L-lysine. The column was
once again re-equilibrated with Buffer B and a further
gradient to 100 mM L-arginine in 300 ml was applied in the
same way. Fractions (6 ml) from the Lys wash and from the Arg
wash were assayed for activity against the two pNA substrates
as described previously. The lysine gradient eluted a major
peak of activity against Z-L-Lys-pNA only and the arginine
gradient did the same for an enzyme degrading Bz-L-Arg-pNA.
The active (for Z-L-Lys-pNA) fractions were pooled and
dialyzed against two changes of 20 mM Bis-Tris-HCl, 1 mM CaCl2,
0.02% (w/v) NaN3, pH 6.4 (Buffer C) and the dialysate was
concentrated to 10 ml using Amicon PM-10 membranes.

The dialysate was applied to an anion exchange FPLC
column (Mono Q FPLC column, Pharmacia LKB Biotechnology Inc.,
Piscataway, NJ) equilibrated in Buffer C. The column was
washed with 5 column volumes of Buffer C at a flow rate of 1.0
ml/min, following which bound protein was eluted with a 3 step
gradient [0-200 mM NaCl (10 min), followed by 200-275 mM NaCl
(15 min) and 275-500 mM NaCl (5 min)], each in Buffer C. The
active fractions from Mono Q chromatography were pooled.
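
For illustration, the three-step Mono Q gradient described above can
be written as a piecewise-linear function of elution time. The sketch
assumes each step is a linear ramp between its stated end points and
that the steps run back to back; neither detail is stated in the text,
and the names below are hypothetical.

```python
# (start mM, end mM, duration min) for each step, as quoted in the text.
SEGMENTS = [(0.0, 200.0, 10.0), (200.0, 275.0, 15.0), (275.0, 500.0, 5.0)]


def nacl_mM(t_min):
    """NaCl concentration at time t_min after the gradient starts."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start, end, duration in SEGMENTS:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    # After the last step, hold at the final concentration (assumption).
    return SEGMENTS[-1][1]


if __name__ == "__main__":
    for t in (0, 5, 10, 17.5, 25, 30):
        print(f"{t:>5} min: {nacl_mM(t):.1f} mM NaCl")
```

The same function could describe the HMW RGP gradient by changing the
middle end point from 275 to 250 mM.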

Example 2. Molecular Weight Determination

The molecular weights of the purified Arg-gingipains and
Lys-gingipains were estimated by gel filtration on a Superose
12 column (Pharmacia, Piscataway, NJ) and by Tricine-SDS
polyacrylamide gel electrophoresis. In the latter case, 1 mM
TLCK was used to inactivate the protease prior to boiling,
thus preventing autoproteolytic digestion.

Example 3. Enzyme Assays 

Amidolytic activities of P. gingivalis proteinases were
measured with the substrates MeO-Suc-Ala-Ala-Pro-Val-pNA at a
concentration of 0.5 mM, Suc-Ala-Ala-Ala-pNA (0.5 mM),
Suc-Ala-Ala-Pro-Phe-pNA (0.5 mM), Bz-Arg-pNA (1.0 mM),
Cbz-Phe-Leu-Glu-pNA (0.2 mM); S-2238, S-2222, S-2288 and S-2251
each at a concentration of 0.05 mM; in 1.0 ml of 0.2 M Tris-HCl,
5 mM CaCl2, pH 7.5. In some cases either 5 mM cysteine and/or 50
mM glycyl-glycine (Gly-Gly) was also added to the reaction
mixture. Z-L-Lys-pNA (0.5 mM) in 0.2 M Tris-HCl, 0.02% (w/v)
NaN3, 10 mM L-cysteine, was used for assay of Lys-gingipain.

General proteolytic activity was assayed using the same
buffer system as described for detecting amidolytic activity,
but using azocoll or azocasein (2% w/v) as substrate as
described for Cathepsin L by Barrett and Kirschke (1981)
Meth. Enzymol. 80:535-561.

For routine assays, pH optimum determination and
measurement of the effect of stimulating agents and inhibitors
on Arg-gingipains, only Bz-L-Arg-pNA was used as substrate.
Potential inhibitory or stimulatory compounds were
preincubated with enzyme for up to 20 min at room temperature
at pH 7.5, in the presence of 5 mM CaCl2 (except when testing
the effects of chelating agents) prior to the assay for enzyme
activity.

General proteolytic activity was assayed using the same 
buffer system as described for detecting amidolytic activity, 
but using azocoll or azocasein (1% w/v) as substrate. 

A unit of RGP enzymatic activity is based on the
spectroscopic assay using benzoyl-Arg-p-nitroanilide as
substrate and recording Δ absorbance units at 405
nm/min/absorbance unit at 280 nm according to the method of
Chen et al. (1992) supra.
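
One way to compute the activity measure defined above is sketched
below: the rate of change of absorbance at 405 nm is estimated as a
least-squares slope over timed readings and then divided by the
absorbance of the enzyme sample at 280 nm. The use of a least-squares
fit and the example readings are assumptions made for illustration and
are not taken from Chen et al.

```python
def slope_per_min(times_min, a405):
    """Least-squares slope of A405 versus time (absorbance units per minute)."""
    n = len(times_min)
    if n < 2 or n != len(a405):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_a = sum(a405) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a405))
    den = sum((t - mean_t) ** 2 for t in times_min)
    return num / den


def activity_units(times_min, a405, a280):
    """Delta A405 per minute per absorbance unit at 280 nm."""
    return slope_per_min(times_min, a405) / a280


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    times = [0.0, 1.0, 2.0, 3.0]
    readings = [0.10, 0.15, 0.20, 0.25]
    print(activity_units(times, readings, a280=0.5))
```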

Example 4. Amino Acid Sequence Analysis

Amino-terminal amino acid sequence analyses were carried
out using an Applied Biosystems 4760A gas-phase sequenator,
using the program designed by the manufacturer.
Alternatively, amino acid sequences were deduced from the
corresponding coding sequences (see
SEQ ID NO:1 and SEQ ID NO:3). The amino acid sequences of the
COOH terminus of SDS-denatured RGP-1 and of the 50 kDa subunit
of HMW RGP were determined. 10 nmol aliquots of gingipain-1
were digested in 0.2 M N-ethylmorpholine acetate buffer, pH
8.0, with carboxypeptidase A and B at room temperature, using
1:100 and 1:50 molar ratios, respectively. Samples were
removed at intervals spanning 0 to 12 hours, boiled to
inactivate the carboxypeptidase, and protein was precipitated
with 20% trichloroacetic acid. Amino acid analyses were
performed on the supernatants.

Example 5. Materials

MeO-Suc-Ala-Ala-Pro-Val-pNA, Suc-Ala-Ala-Pro-Phe-pNA,
Gly-Pro-pNA, Suc-Ala-Ala-Ala-pNA, Bz-Arg-pNA,
diisopropylfluorophosphate, phenylmethylsulfonyl fluoride,
tosyl-L-lysine chloromethyl ketone (TLCK),
tosyl-L-phenylalanine chloromethyl ketone (TPCK),
trans-epoxysuccinyl-L-leucylamide-(4-guanidino)butane, an
inhibitor of cysteine proteinases, leupeptin, antipain and
azocasein were obtained from Sigma Chemical Co., St. Louis, MO.
3,4-Dichloroisocoumarin was obtained from Boehringer,
Indianapolis, IN and Cbz-Phe-Leu-Glu-pNA and azocoll were
obtained from Calbiochem, La Jolla, CA. S-2238
(D-Phe-Pip-Arg-pNA), S-2222 (Bz-Ile-Glu-(γ-OR)-Gly-Arg-pNA),
S-2288 (D-Ile-Pro-Arg-pNA), and S-2251 (D-Val-Leu-Lys-pNA) were
from Kabi-Vitrum (Beaumont, Texas).

Example 6. Electrophoresis

SDS-PAGE was performed as in Laemmli (1970) Nature
227:680-685. Prior to electrophoresis the samples were boiled
in a buffer containing 20% glycerol, 4% SDS, and 0.1%
bromophenol blue. The samples were run under reducing
conditions by adding 2% β-mercaptoethanol unless otherwise
noted. Samples were heated for 5 min at 100°C prior to
loading onto gels. A 5-15% gradient gel was used for the
initial digests of C3 and C5, and the gels were subsequently
stained with Coomassie Brilliant Blue R. The C5 digests used
to visualize breakdown products before and after reduction of
the disulfide bonds were electrophoresed in an 8% gel.
Attempts to visualize C5a in the C5 digest were carried out
using 13% gels that were developed with silver stain according
to the method of Merril et al. (1979) Proc. Natl. Acad. Sci.
USA 76:4335-4340. In some experiments (with HMW RGP) SDS-PAGE
using Tris-HCl/Tricine buffer was carried out per Schagger and
von Jagow (1987) Analyt. Biochem. 166:368-379.

Example 7. Coding Sequences for Arg-gingipains and
Lys-gingipains

λDASH DNA libraries were constructed according to the
protocols of Stratagene, using the lambda DASH™ II/BamHI
cloning kit and DNA preparations from P. gingivalis strains
HG66 (W83) and W50. A library of 3x10^ independent recombinant
clones was obtained using P. gingivalis H66 DNA, and 1.5x10^
independent recombinant clones were obtained from virulent P.
gingivalis W50 DNA. Screening and characterization of
positive clones is described in U.S. Patent Nos. 5,323,390
and 5,475,077. The coding and amino acid sequences of the
polyprotein precursor of the HMW RGP are given in SEQ ID NO:5.
SEQ ID NO:7 provides the Lys-gingipain coding sequence and SEQ
ID NO:8 the amino acid sequence.

Example 8. Animal Model Studies

A mouse animal model [described in Genco et al. (1991)
Infect. Immun. 59:1255-1263] was used to study the protective
effects of immunogenic compositions comprising P. gingivalis
proteinases and/or peptides derived therefrom.

Peptides for use as immunogens were synthesized using an
Applied Biosystems automated solid state process and the
multi-lysine base according to the method of Tam, J. P. (1988)
Proc. Natl. Acad. Sci. USA 5409-5413 and Posnett et al.
(1988) J. Biol. Chem. 263:1719-1725. After purification, the
peptides were suspended as described below. The multiple
lysine base provides a framework for the simultaneous
synthesis of multiple identical peptides and results in an
"octopus"-like molecule which is antigenic without the need
for conjugation to a carrier peptide. The multiple lysine base
is not itself antigenic. Thus, this technique offers some
advantages over the previous peptide immunizations which
required conjugation to carrier proteins such as keyhole
limpet hemocyanin and bovine serum albumin. RGP-related
peptide sequences used in these experiments are provided
below.

Whole cell antigens for immunization were prepared by
centrifugation of P. gingivalis cultures for 10 min at 10,000
x g at room temperature and resuspension in 1/10 the original
volume of anaerobic broth. Bacterial cells were heated to
SS'^C for 10 min, and heat-treated preparations were plated on 
anaerobic blood agar and incubated for 7 days under anaerobic 
conditions to confirm effective killing. RGPs were purified
from strain HG66 as described hereinabove.

Mice were immunized by injection of each immunogen (50
μg/mouse in Freund's complete adjuvant) in subcutaneous
chambers implanted in mice [Genco et al. (1992) Infect. Immun.
60:1447]. Animals immunized with heat-killed P. gingivalis
received an initial immunization corresponding to 10® CFU.
Control mice were immunized with Freund's adjuvant only.

Female BALB/c mice about 8 weeks old are obtained from
Sasco (Omaha, NE) or Charles River Laboratory (Wilmington,
MA). Coil-shaped subcutaneous (SC) chambers were prepared
from 0.5 mm stainless steel wire and surgically implanted in
the SC tissue of the dorsolumbar region of each mouse, with
anaesthesia. A recovery period of at least 10 days is allowed
before further treatment. During the 10 day period, the outer
incision heals completely and the chambers become encapsulated
by a thin vascularized layer of fibrous connective tissue and
gradually filled with approximately 0.5 ml of light-colored
transudate.

After the 10 day recovery period, the mice are immunized
according to the scheme in Table 1:

Table 1

Group   Immunogen                                     Number of Mice
A       None                                          6
B       50 kDa RGP-2                                  6
C       Peptide B                                     8
D       Peptide C                                     8
E       Peptide A                                     8
F       95 kDa HMW RGP                                8
G       Heat-killed P. gingivalis A7436 whole cells   8


Stock solutions of immunogens were as follows: RGP-2,
1.65 mg/ml in 20 mM Bis-Tris, 150 mM NaCl, 5 mM CaCl2, 0.02%
NaN3, pH 6.8 and diluted to 1 mg/ml for use in immunizations;
Peptide B (SEQ ID NO:11, QLPFIFDVACVNGDFLFSMPCFAEALMRAQ,
catalytic domain of HMW RGP), 1 mg/ml in cold NH4HCO3 made
fresh; Peptide C (SEQ ID NO:12,
GEPNPYQPVSNLTATTQGQKVTLKWDAPSTK, hemagglutinin domain of HMW
RGP), 1 mg/ml in 10 mM acetic acid; Peptide A (SEQ ID NO:10,
YTPVEEKQNGRMIVIVAKKY, N-terminus of the HMW RGP catalytic
subunit), 1 mg/ml in 10 mM acetic acid; RGP-2, 0.96 mg/ml in 20
mM Bis-Tris, 150 mM NaCl, 5 mM CaCl2, 0.02% NaN3, pH 6.8; and
heat -killed whole P. gingivalis A7436 bacterial cells, lOVml. 
Group A mice (unimmunized controls) were inoculated with only
Freund's complete adjuvant. Groups B-F were immunized with 50
μg of MAP-peptides or protein in Freund's complete adjuvant
per mouse in the primary immunizations injected into the
chambers or SC. Groups B-F mice were given booster
immunizations of 50 μg MAP-peptide twice a week for 5 weeks in
Freund's incomplete adjuvant. Group G mice were immunized by
injecting the heat-killed whole bacterial cells into the
chambers (without adjuvant). 10® cells were injected into the
chambers directly in the primary immunization; 10^ cells were
injected in all booster immunizations.
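
The dosing arithmetic implied by the schedule above is sketched below,
by way of example only. It assumes that "twice a week for 5 weeks"
means ten booster injections and that doses are drawn from the 1 mg/ml
working solutions; the function names are hypothetical.

```python
def volume_ul_for_dose(dose_ug, conc_mg_per_ml):
    """Volume in microliters delivering dose_ug (1 mg/ml equals 1 ug/ul)."""
    return dose_ug / conc_mg_per_ml


def stock_ul_for_dilution(stock_mg_per_ml, final_mg_per_ml, final_volume_ul):
    """Stock volume needed for a dilution, from C1*V1 = C2*V2."""
    return final_mg_per_ml * final_volume_ul / stock_mg_per_ml


if __name__ == "__main__":
    dose_ug = 50.0
    boosts = 2 * 5  # twice a week for 5 weeks (assumed to mean 10 boosts)
    print("injection volume:", volume_ul_for_dose(dose_ug, 1.0), "ul")
    print("RGP-2 stock per ml of 1 mg/ml solution:",
          round(stock_ul_for_dilution(1.65, 1.0, 1000.0)), "ul")
    print("cumulative dose per mouse:", dose_ug * (1 + boosts), "ug")
```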

Mice are challenged with live P. gingivalis A7436 (2 x
10" colony forming units) five weeks after the initial
immunization. The mice are observed daily for general
appearance, primary and/or secondary abscess formation and
health status. Chamber fluid is removed daily with a
hypodermic needle and syringe for bacteriologic culture and
microscopic examination. Fluid is also examined for the
presence and activity of antibodies to the respective
peptides. All surviving animals are sacrificed 30 days after
inoculation, and the sera are separated from blood obtained by
cardiac puncture.

During the 10 day period the outer incision heals
completely and the chambers become encapsulated by a thin
vascularized layer of fibrous connective tissue and gradually
filled with approximately 0.5 ml of light-colored transudate.
Ten days after implantation, chambers are inoculated with 0.1
ml of a suspension of P. gingivalis cells in prereduced
Anaerobic Broth MIC (Difco Laboratories, Detroit, MI).
Control SC chambers were injected with Schaedler broth lacking
bacterial cells. Mice were examined daily for size and
consistency of primary or secondary lesions and for general
appearance, primary and/or secondary abscess formation and
health status. Severe cachexia is characterized by ruffled
hair, hunched bodies and weight loss. Chamber fluid is
aseptically removed from each implanted chamber with a 25
gauge hypodermic needle and syringe at 1 to 7, and 14 days
after inoculation for bacteriological culture and microscopic
examination. All surviving animals are sacrificed at 30 days
postinoculation and serum is separated from blood obtained by
cardiac puncture.

Aliquots of chamber fluid are streaked after live
bacterial challenge for isolated microbial colonies on
anaerobic blood agar plates (Remel, Lenexa, KS) and incubated
for 7 days at 37°C under anaerobic conditions. P. gingivalis
is then identified by standard techniques as described in
Holdeman et al. (1984) "Anaerobic gram-negative straight,
curved and helical rods. Family 1. Bacteroidaceae Pribram,"
In N.R. Krieg and J.G. Holt (ed.) Bergey's Manual of
Determinative Bacteriology, The Williams & Wilkins Co.,
Baltimore, MD, p. 602-631. Cultivable bacterial counts are
obtained by serially diluting chamber fluid in Schaedler broth
and spin plating onto anaerobic blood agar plates.

Table 2 provides the results for recovery of P.
gingivalis from the SC chambers at various times after
challenge.

Table 2

P. gingivalis cultured from chamber fluid

% of mouse chambers from which P. gingivalis was cultured on
given day postinoculation, and (CFU obtained from chambers)

Group   Day 1             Day 2             Day 4             Day [illegible]
A       83% (1.8 x 10")   66% (1.6 x 10")   83% (1.1 x 10")   100% (7.2 x 10")
B       33% (7.6 x 10")   16% (4.7 x 10")   16% (1.5 x 10")   0%
C       38% (8.4 x 10")   38% (1.4 x 10")   25% (1.1 x 10")   29% (1.9 x 10")
D       63% (7.3 x 10")   75% (1.7 x 10")   50% (6.8 x 10")   63% (2.2 x 10")
E       38% (1.4 x 10")   50% (4.7 x 10*)   25% (4.0 x 10*)   0 (ND)
F       38% (ND)          25% (ND)          13% (ND)          0 (ND)
G       13% (ND)          0 (ND)            0 (ND)            0 (ND)

* ND means not detectable

Table 3 summarizes the results of the analysis of the
pathological course of the P. gingivalis challenge in control
and immunized animals.

Table 3

Pathological course of P. gingivalis infection.

Group   % abdominal lesion   % death
A       50%                  50%
B       0                    0
C       13%                  13%
D       0                    0
E       0                    0
F       0                    0
G       0                    0
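
The percentages reported in Tables 2 and 3 can be related back to
animal counts using the group sizes in Table 1. The sketch below is
illustrative only: it assumes each percentage was computed against the
full group size, which the text does not state, and the helper name is
hypothetical.

```python
# Group sizes from Table 1.
GROUP_SIZES = {"A": 6, "B": 6, "C": 8, "D": 8, "E": 8, "F": 8, "G": 8}


def implied_count(percent, group):
    """Nearest whole number of animals matching a reported percentage."""
    return round(percent / 100.0 * GROUP_SIZES[group])


if __name__ == "__main__":
    # Day 1 culture-positive chambers (Table 2).
    for group, pct in [("A", 83), ("B", 33), ("C", 38), ("D", 63), ("G", 13)]:
        print(group, implied_count(pct, group), "of", GROUP_SIZES[group])
    # Deaths (Table 3).
    print("A deaths:", implied_count(50, "A"), " C deaths:", implied_count(13, "C"))
```

On this reading, 83% of six chambers corresponds to five animals and
13% of eight to a single animal. The 29% entry for Group C does not
correspond to a whole number out of eight; it is closer to two of
seven, which would be consistent with the one death reported for that
group, although the text does not say so.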



Specific immunoglobulin G (IgG) to P. gingivalis whole
cells is quantitated from both chamber fluids and sera for
each group of mice. IgG specific for P. gingivalis whole
cells is assayed by a modification of an enzyme-linked
immunosorbent assay (ELISA) described by Ebersole et al.
(1989) J. Dent. Res. 68:286, abstract 1171. The results are
read with a Vmax kinetic photometer (Molecular Devices Corp.,
Menlo Park, CA) at 450 nm. An aliquot of serum from each
group of mice (inoculated with different strains of P.
gingivalis) is pooled and used as a positive standard and run
on each plate.

Further protection experiments are performed to test the
following peptides: RGP catalytic domain Peptide B,
QLPFIFDVACVNGDFLFSMPCFAEALMRAQ, SEQ ID NO:11, MAP form;
scrambled catalytic domain, in both MAP and acid forms,
DQANFLQCVGSLMCRLDFFFEAVMPIFPAA, SEQ ID NO:13; N-terminal
sequence of catalytic subunit of HMW RGP, Peptide A, MAP form,
YTPVEEKQNGRMIVIAKKY, SEQ ID NO:10; adhesin domain
peptide (Peptide D) from adhesin/hemagglutinin domain of HMW
RGP, in MAP and acid forms, GNHEYCVEVKYTAGVSPKVCKDVTV, SEQ ID
NO:14; "scrambled" adhesin domain peptide from HMW RGP, in MAP
and acid forms, AHEKTYPVEDVNCSYVKTVCVGGKV, SEQ ID NO:15.
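
A scrambled control peptide is expected to have the same amino acid
composition as its parent but a different order. The sketch below
checks this for the sequences given for SEQ ID NOs 11, 13, 14 and 15;
the check itself is offered only as an illustration and is not part of
the procedure described herein.

```python
from collections import Counter

PEPTIDE_B = "QLPFIFDVACVNGDFLFSMPCFAEALMRAQ"    # SEQ ID NO:11
SCRAMBLED_B = "DQANFLQCVGSLMCRLDFFFEAVMPIFPAA"  # SEQ ID NO:13
PEPTIDE_D = "GNHEYCVEVKYTAGVSPKVCKDVTV"         # SEQ ID NO:14
SCRAMBLED_D = "AHEKTYPVEDVNCSYVKTVCVGGKV"       # SEQ ID NO:15


def is_scramble(parent, candidate):
    """True if candidate has the same residue composition in a different order."""
    return parent != candidate and Counter(parent) == Counter(candidate)


if __name__ == "__main__":
    print("Peptide B control:", is_scramble(PEPTIDE_B, SCRAMBLED_B))
    print("Peptide D control:", is_scramble(PEPTIDE_D, SCRAMBLED_D))
```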

Peptides equivalent in amino acid sequence to portions of
Arg-gingipains, including adhesin/hemagglutinin domains and/or
catalytic proteins, have protective effects when used to
immunize mice in the animal model described herein.
"Scrambled" peptides do not confer protective immunity to
subsequent challenge by live, infectious P. gingivalis.

Additional peptides within the scope of the present
invention include RMFMNYEPGRYTPVEEKQNG (SEQ ID NO: 16) which
overlaps the activation site, TFAGFEDTYKRMFMNYEPGR (SEQ ID
NO: 17) which is located some twenty amino acids upstream of
the activation site,
DYTYTVYRDGTKIKEGLTATTFEEDGVATGNHEYCVEVKYTAGVSPKVC (SEQ ID
NO: 18), YTYTVYRDGTKIKEGLTATTFEEDG (SEQ ID NO: 19),
RDGTKIKEGLTATTFEEDGVATGN (SEQ ID NO: 20) and
KIKEGLTATTFEEDGVATGNHEY (SEQ ID NO: 21), all of which contain
the FEED (SEQ ID NO: 22) sequence which participates in
fibronectin binding. Peptide KWDAPNGTPNPNPNPNPNPNPGTTTLSE
(SEQ ID NO: 23) also can result in protective immunity after
vaccination of a human or animal.
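
The presence of the FEED sequence in the listed peptides can be checked mechanically. The sketch below is illustrative only; the dictionary and function names are assumptions. It reports the zero-based offset of the motif in SEQ ID NOS: 19 to 21 as recited above.

```python
MOTIF = "FEED"  # SEQ ID NO: 22

# Peptides as recited in the text.
PEPTIDES = {
    "SEQ ID NO: 19": "YTYTVYRDGTKIKEGLTATTFEEDG",
    "SEQ ID NO: 20": "RDGTKIKEGLTATTFEEDGVATGN",
    "SEQ ID NO: 21": "KIKEGLTATTFEEDGVATGNHEY",
}


def motif_offsets(peptides: dict, motif: str) -> dict:
    """Map each peptide name to the zero-based offset of the motif (-1 if absent)."""
    return {name: seq.find(motif) for name, seq in peptides.items()}


if __name__ == "__main__":
    for name, offset in motif_offsets(PEPTIDES, MOTIF).items():
        print(name, offset)
```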

A second immunization/challenge was carried out using
BALB/c mice in the subcutaneous chamber model described above.
Groups of eight mice were immunized by injection
into the implanted subcutaneous chambers as set forth in Table
4:
Table 4

Group   Immunogen                                      Number of Mice
A       None                                           8
B       50 kDa RGP-2                                   8
E       Peptide D                                      8
F       "Scrambled" Peptide D                          8
G       Peptide A                                      8
H       Peptide A                                      8
I       95 kDa RGP-1                                   8
J       Heat-killed P. gingivalis A7436 whole cells    8



Group A mice (negative controls) were injected with
Freund's complete adjuvant only. Mice in groups E-H were each
first injected with 50 µg MAP-peptide in Freund's complete
adjuvant; eight boosts each contained 50 µg MAP-peptide in
Freund's incomplete adjuvant. For groups E and F, boosts #3
and #6 were with free peptide. Groups B and F were treated as
in the first experiment with eight boosts. Group J mice
received heat-killed P. gingivalis A7436 cells without
adjuvant (10* cells in primary injection, 10* cells per boost) . 

Each mouse was challenged by injection of 3 . 9 x 10^^ P. 
gingivalis A7436 into the subcutaneous chambers on the 32nd 
day after primary immunization. 
Table 5 presents the results for recovery of viable P.
gingivalis cells from the subcutaneous chambers on days 1, 2,
3, 5 and 7 after challenge.



Table 5

Recovery of P. gingivalis from chambers following challenge

% of mice from which P. gingivalis was cultured and (CFU), by day following challenge

Group   Day 1               Day 2               Day 3               Day 5               Day 7
A       100% (2.1 x 10^?)   100% (1.6 x 10^?)   88% (1.1 x 10^?)    68% (6 x 10^?)      88% (2.6 x 10^?)
B       88% (1.0 x 10^?)    75% (2.1 x 10^?)    63% (2.8 x 10^?)    75% (2 x 10^?)      75% (2 x 10^?)
C       75% (1.6 x 10^?)    50% (1.2 x 10^?)    50% (6 x 10^?)      50% (1.2 x 10^?)    50% (1.6 x 10^?)
D       75% (2.1 x 10^?)    75% (NF*)           75% (NF)            75% (NF)            75% (NF)
E       75% (2.4 x 10^?)    63% (1 x 10^?)      63% (4.5 x 10^?)    63% (2 x 10^?)      63% (NF)
F       63% (NF)            63% (NF)            50% (NF)            50% (NF)            50% (NF)
G       75% (6 x 10^?)      63% (1.5 x 10^?)    63% (8 x 10^?)      63% (5 x 10^?)      63% (5 x 10^?)
H       75% (1.4 x 10^?)    75% (NF)            75% (NF)            50% (NF)            63% (NF)
I       88% (6 x 10^?)      63% (NF)            38% (NF)            38% (NF)            38% (NF)
J       100% (1.4 x 10^?)   88% (1.7 x 10^?)    100% (NF)           88% (NF)            88% (NF)

? = exponent illegible in the source text.
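
The percentages in Table 5 are consistent with counts out of the eight mice in each group, rounded to the nearest whole percent with halves rounded up (for example, 7 of 8 gives 88% and 5 of 8 gives 63%). This rounding rule is an assumption inferred from the tabulated values, not something stated in the text, and the function name below is hypothetical. A minimal Python sketch of the conversion:

```python
def percent_positive(positive: int, total: int) -> int:
    """Percentage of culture-positive mice, rounded half up to a whole percent.

    Integer arithmetic is used so that 62.5 rounds to 63 rather than to 62,
    which Python's built-in round() would return.
    """
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("positive must lie between 0 and total, total > 0")
    return (200 * positive + total) // (2 * total)


if __name__ == "__main__":
    for n in range(9):
        print(f"{n}/8 -> {percent_positive(n, 8)}%")
```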
Table 6 summarizes the observations for pathological
effects at 7 days after challenge.



Table 6

Pathology observed following challenge with P. gingivalis

Group   % Lesions   % Deaths   Cachexia
A       38%         38%        +++++
B       0           0          +
C       0           0          ++
D       0           0          ++++
E       0           13%        ++
F       25%         0          ++++
G       0           0          +
H       0           0          +
I       0           0
J       0           0          ++



Cachexia scored on a scale from +++++ to - , with +++++ as 
severe and - as no cachexia. 

In further animal experiments, seven days post primary
immunization mice were boosted (10x) at 3 day intervals with
RGP-1, RGP-2, or MAP-conjugated peptides (50 µg/mouse in
Freund's incomplete adjuvant). Animals immunized with
heat-killed P. gingivalis were boosted (10x) at 3 day intervals
with heat-killed P. gingivalis corresponding to 10^ CFU. At
14, 21, and 28 days post immunization, chamber fluid was
removed with a hypodermic needle and syringe, and IgG specific
for RGP-1, RGP-2, KGP, and whole cells was quantitated by an
immunosorbent assay [Ebersole et al. (1984) J. Clin.
Microbiol. 19:639]. Mice were challenged by inoculation of 10'
CFU of P. gingivalis A7436 directly into chambers 49 days
postimmunization and examined daily for size and consistency
of lesions and health status. Severe cachexia was defined as
ruffled hair, hunched bodies, and weight loss. Chamber fluid
was removed from each implanted chamber at 1 to 7 days
postchallenge for bacteriological culturing and immunological
analysis. All surviving animals were sacrificed 30 days
postchallenge, and sera were separated from blood obtained by
cardiac puncture.

Table 7

Recovery of P. gingivalis from chamber fluid following challenge

Number of mice from which P. gingivalis was cultured/total number of mice
sampled on the following day post inoculation*

Group               Total Mice   Day 1                Day 2                Day 5                Day 7
Non-immunized       22           21/22 (1.4 x 10^?)   20/22 (1.1 x 10^?)   20/22 (2.4 x 10^?)
Peptide A           32           23/32 (7.2 x 10^?)   21/21 (1.9 x 10^?)   19/32 (5.5 x 10^?)   19/32 (<10^?)
Scrambled peptide   8            8/8 (6.7 x 10^?)     8/8 (4.8 x 10^?)     7/8 (2.0 x 10^?)     7/8 (5.6 x 10^?)
Whole cells         24           17/24 (7.4 x 10^?)   11/27 (8.8 x 10^?)   9/24 (4.6 x 10^?)    6/24 (<10^?)
RGP-1               24           12/24 (2 x 10^?)     9/24 (8 x 10^?)      4/24 (<10^?)         3/24 (<10^?)
RGP-2               22           15/22 (6.1 x 10^?)   9/22 (1.8 x 10^?)    7/22 (1.2 x 10^?)    6/22 (<10^?)

? = exponent illegible in the source text.



Aliquots of fluid from each chamber were streaked for
isolation onto anaerobic blood agar plates and cultured
at 37°C for 7 days under anaerobic conditions.



All animals in this group had died by day 7. 
Colony forming units obtained from chamber fluid. 
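
The culture results in Table 7 are reported as fractions (mice culture-positive/mice sampled). A minimal Python sketch, using hypothetical names, converts such cells to percentages so that groups of different size can be compared; the example values are the RGP-1 row as tabulated.

```python
def culture_rate(cell: str) -> float:
    """Convert a 'positive/sampled' table cell to a percentage (one decimal)."""
    positive, sampled = (int(part) for part in cell.split("/"))
    if sampled <= 0 or not 0 <= positive <= sampled:
        raise ValueError(f"implausible cell: {cell!r}")
    return round(100 * positive / sampled, 1)


# RGP-1 row of Table 7, keyed by day post inoculation.
RGP1_ROW = {1: "12/24", 2: "9/24", 5: "4/24", 7: "3/24"}

if __name__ == "__main__":
    for day, cell in RGP1_ROW.items():
        print(f"day {day}: {cell} = {culture_rate(cell)}%")
```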
Table 8

Pathological course of P. gingivalis infection in immunized mice

Group               Total Mice   Lesions   Deaths   Cachexia
Non-immunized       22           14/22     14/22
Peptide A           32           1/32      0/32
Scrambled peptide   8            5/8       5/8      ++++
Whole cells         24           0/24      0/24     +
RGP-1               22           0/22      0/22
RGP-2               24           0/24      0/22

Lesions: number of mice with secondary lesion on the ventral
abdomen/total number of mice tested, as detected on day 7.

Deaths: number of dead mice/total number of mice tested by day 7.

Cachexia: scored on a scale from severe cachexia to no cachexia.



Additional animal experiments are carried out in a mouse 

periodontitis model as described by Oral infection is 

with P. gingivalis cells in carboxymethylcellulose by gavage. 
Where there is infection and resulting periodontal disease, 
there is measurable bone loss by the end of 6 weeks, P. 
gingivalis can be cultured from infected sites, and damage 
within the periodontal ligament can be assessed 
SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: UNIVERSITY OF GEORGIA RESEARCH FOUNDATION, INC. 
MOREHOUSE SCHOOL OF MEDICINE 
POTEMPA, JAN
TRAVIS, JAMES
GENCO, CAROLINE A. 

(ii) TITLE OF INVENTION: IMMUNOGENIC COMPOSITIONS COMPRISING 
PORPHYROMONAS GINGIVALIS PROTEINS AND/OR PEPTIDES AND 
METHODS 

(iii) NUMBER OF SEQUENCES: 24 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Greenlee, Winner and Sullivan, P.C. 

(B) STREET: 5370 Manhattan Circle, Suite 201 

(C) CITY: Boulder 
(D) STATE: CO

(E) COUNTRY: US
(F) ZIP: 80303 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: 21-MAR-1997 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/013,945 

(B) FILING DATE: 22-MAR-1996

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Ferber, Donna M.

(B) REGISTRATION NUMBER: 33,878

(C) REFERENCE/DOCKET NUMBER: 103-95

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (303) 488-6080 

(B) TELEFAX: (303) 499-8089 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: N-terminal

(ix) FEATURE:

(A) NAME/KEY: Region
(B) LOCATION: 38..43

(D) OTHER INFORMATION: /product= "Xaa"
/label= Xaa

/note= "Xaa is used to denote an amino acid which could not be
identified with certainty."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Tyr Thr Pro Val Glu Glu Lys Gln Asn Gly Arg Met Ile Val Ile Val
1 5 10 15

Ala Lys Lys Tyr Glu Gly Asp Ile Lys Asp Phe Val Asp Trp Lys Asn
20 25 30

Gln Arg
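
The sequence listing uses three-letter amino acid codes, whereas the peptides in the description use one-letter codes. The following Python sketch, with hypothetical names, converts between the two; the example is the first twenty residues of SEQ ID NO:1 read with standard three-letter codes (Lys, Gln, Ile).

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string.

    Raises KeyError on any token that is not a standard code, which helps
    flag transcription errors in a listing.
    """
    return "".join(THREE_TO_ONE[token] for token in listing.split())


# First twenty residues of SEQ ID NO:1.
SEQ1_START = (
    "Tyr Thr Pro Val Glu Glu Lys Gln Asn Gly "
    "Arg Met Ile Val Ile Val Ala Lys Lys Tyr"
)

if __name__ == "__main__":
    print(three_to_one(SEQ1_START))
```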



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: C-terminal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Glu Leu Leu Arg 
1 

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3159 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Porphyromonas gingivalis

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 949..3159



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTGCAGAGGG CTGGTAAAGA CCGCCTCGGG ATCGAGGCCT TTGAGACGGG CACAAGCCGC 60
CGCAGCCTCC TCTTCGAAGG TGTCTCGAAC GTCCACATCG GTGAATCCGT AGCAGTGCTC 120
ATTGCCATTG AGCAGCACCG AGGTGTGGCG CATCAGATAT ATTTTCATCA GTGGATTATT 180
AGGGTATCGG TCAGAAAAAG CCTTCCGAAT CCGACAAAGA TAGTAGAAAG AGAGTGCATC 240
TGAAAACAGA TCATTCGAGG ATTATCGATC AACTGAAAAG GCAGGAGTTG TTTTGCGTTT 300
TGGTTCGGAA AATTACCTGA TCAGCATTCG TAAAAACGTG GCGCGAGAAT TTTTTCGTTT 360
TGGCGCGAGA ATTAAAAATT TTTGGAACCA CAGCGAAAAA AATCTCGCGC CGTTTTCTCA 420
GGATTTACAG ACCACAATCC GAGCATTTTC GGTTCGTAAT TCATCGAAGA GACAGGTTTT 480
ACCGCATTGA AATCAGAGAG AGAATATCCG TAGTCCAACG GTTCATCCTT ATATCAGAGG 540
TTAAAAGATA TGGTACGCTC ATCGAGGAGC TGATTGGCTT AGTAGGTGAG ACTTTCTTAA 600
GAGACTATCG GCACCTACAG GAAGTTCATG GCACACAAGG CAAAGGAGGC AATCTTCGCA 660
GACCGGACTC ATATCAAAAG GATGAAACGA CTTTTCCATA CGACAACCAA ATAGCCGTCT 720
ACGGTAGACG AATGCAAACC CAATATGAGG CCATCAATCA ATCCGAATGA CAGCTTTTGG 780
GCAATATATT ATGCATATTT TGATTCGCGT TTAAAGGAAA AGTGCATATA TTTGCGATTG 840
TGGTATTTCT TTCGGTTTCT ATGTGAATTT TGTCTCCCAA GAAGACTTTA TAATGCATAA 900
ATACAGAAGG GGTACTACAC AGTAAAATCA TATTCTAATT TCATCAAA ATG AAA AAC 957
Met Lys Asn
1

TTG AAC AAG TTT GTT TCG ATT GCT CTT TGC TCT TCC TTA TTA GGA GGA 1005
Leu Asn Lys Phe Val Ser Ile Ala Leu Cys Ser Ser Leu Leu Gly Gly
5 10 15

ATG GCA TTT GCG CAG CAG ACA GAG TTG GGA CGC AAT CCG AAT GTC AGA 1053
Met Ala Phe Ala Gln Gln Thr Glu Leu Gly Arg Asn Pro Asn Val Arg
20 25 30 35

TTG CTC GAA TCC ACT CAG CAA TCG GTG ACA AAG GTT CAG TTC CGT ATG 1101
Leu Leu Glu Ser Thr Gln Gln Ser Val Thr Lys Val Gln Phe Arg Met
40 45 50

GAC AAC CTC AAG TTC ACC GAA GTT CAA ACC CCT AAG GGA ATC GGA CAA 1149
Asp Asn Leu Lys Phe Thr Glu Val Gln Thr Pro Lys Gly Ile Gly Gln
55 60 65

GTG CCG ACC TAT ACA GAA GGG GTT AAT CTT TCC GAA AAA GGG ATG CCT 1197
Val Pro Thr Tyr Thr Glu Gly Val Asn Leu Ser Glu Lys Gly Met Pro
70 75 80

ACG CTT CCC ATT CTA TCA CGC TCT TTG GCG GTT TCA GAC ACT CGT GAG 1245
Thr Leu Pro Ile Leu Ser Arg Ser Leu Ala Val Ser Asp Thr Arg Glu
85 90 95

ATG AAG GTA GAG GTT GTT TCC TCA AAG TTC ATC GAA AAG AAA AAT GTC 1293
Met Lys Val Glu Val Val Ser Ser Lys Phe Ile Glu Lys Lys Asn Val
100 105 110 115

CTG ATT GCA CCC TCC AAG GGC ATG ATT ATG CGT AAC GAA GAT CCG AAA 1341
Leu Ile Ala Pro Ser Lys Gly Met Ile Met Arg Asn Glu Asp Pro Lys
120 125 130

AAG ATC CCT TAC GTT TAT GGA AAG AGC TAC TCG CAA AAC AAA TTC TTC 1389
Lys Ile Pro Tyr Val Tyr Gly Lys Ser Tyr Ser Gln Asn Lys Phe Phe
135 140 145
CCG GGA GAG ATC GCC ACG CTT GAT GAT CCT TTT ATC CTT CGT GAT GTG 1437
Pro Gly Glu Ile Ala Thr Leu Asp Asp Pro Phe Ile Leu Arg Asp Val
150 155 160

CGT GGA CAG GTT GTA AAC TTT GCG CCT TTG CAG TAT AAC CCT GTG ACA 1485
Arg Gly Gln Val Val Asn Phe Ala Pro Leu Gln Tyr Asn Pro Val Thr
165 170 175

AAG ACG TTG CGC ATC TAT ACG GAA ATC ACT GTG GCA GTG AGC GAA ACT 1533
Lys Thr Leu Arg Ile Tyr Thr Glu Ile Thr Val Ala Val Ser Glu Thr
180 185 190 195

TCG GAA CAA GGC AAA AAT ATT CTG AAC AAG AAA GGT ACA TTT GCC GGC 1581
Ser Glu Gln Gly Lys Asn Ile Leu Asn Lys Lys Gly Thr Phe Ala Gly
200 205 210

TTT GAA GAC ACA TAC AAG CGC ATG TTC ATG AAC TAC GAG CCG GGG CGT 1629
Phe Glu Asp Thr Tyr Lys Arg Met Phe Met Asn Tyr Glu Pro Gly Arg
215 220 225

TAC ACA CCG GTA GAG GAA AAA CAA AAT GGT CGT ATG ATC GTC ATC GTA 1677
Tyr Thr Pro Val Glu Glu Lys Gln Asn Gly Arg Met Ile Val Ile Val
230 235 240

GCC AAA AAG TAT GAG GGA GAT ATT AAA GAT TTC GTT GAT TGG AAA AAC 1725
Ala Lys Lys Tyr Glu Gly Asp Ile Lys Asp Phe Val Asp Trp Lys Asn
245 250 255

CAA CGC GGT CTC CGT ACC GAG GTG AAA GTG GCA GAA GAT ATT GCT TCT 1773
Gln Arg Gly Leu Arg Thr Glu Val Lys Val Ala Glu Asp Ile Ala Ser
260 265 270 275

CCG GTT ACA GCT AAT GCT ATT CAG CAG TTC GTT AAG CAA GAA TAC GAG 1821
Pro Val Thr Ala Asn Ala Ile Gln Gln Phe Val Lys Gln Glu Tyr Glu
280 285 290

AAA GAA GGT AAT GAT TTG ACC TAT GTT CTT TTG GTT GGC GAT CAC AAA 1869
Lys Glu Gly Asn Asp Leu Thr Tyr Val Leu Leu Val Gly Asp His Lys
295 300 305

GAT ATT CCT GCC AAA ATT ACT CCG GGG ATC AAA TCG GAC CAG GTA TAT 1917
Asp Ile Pro Ala Lys Ile Thr Pro Gly Ile Lys Ser Asp Gln Val Tyr
310 315 320

GGA CAA ATA GTA GGT AAT GAC CAC TAC AAC GAA GTC TTC ATC GGT CGT 1965
Gly Gln Ile Val Gly Asn Asp His Tyr Asn Glu Val Phe Ile Gly Arg
325 330 335

TTC TCA TGT GAG AGC AAA GAG GAT CTG AAG ACA CAA ATC GAT CGG ACT 2013
Phe Ser Cys Glu Ser Lys Glu Asp Leu Lys Thr Gln Ile Asp Arg Thr
340 345 350 355

ATT CAC TAT GAG CGC AAT ATA ACC ACG GAA GAC AAA TGG CTC GGT CAG 2061
Ile His Tyr Glu Arg Asn Ile Thr Thr Glu Asp Lys Trp Leu Gly Gln
360 365 370

GCT CTT TGT ATT GCT TCG GCT GAA GGA GGC CCA TCC GCA GAC AAT GGT 2109
Ala Leu Cys Ile Ala Ser Ala Glu Gly Gly Pro Ser Ala Asp Asn Gly
375 380 385

GAA AGT GAT ATC CAG CAT GAG AAT GTA ATC GCC AAT CTG CTT ACC CAG 2157
Glu Ser Asp Ile Gln His Glu Asn Val Ile Ala Asn Leu Leu Thr Gln
390 395 400

TAT GGC TAT ACC AAG ATT ATC AAA TGT TAT GAT CCG GGA GTA ACT CCT 2205
Tyr Gly Tyr Thr Lys Ile Ile Lys Cys Tyr Asp Pro Gly Val Thr Pro
405 410 415
AAA AAC ATT ATT GAT GCT TTC AAC GGA GGA ATC TCG TTG GTC AAC TAT 2253
Lys Asn Ile Ile Asp Ala Phe Asn Gly Gly Ile Ser Leu Val Asn Tyr
420 425 430 435

ACG GGC CAC GGT AGC GAA ACA GCT TGG GGT ACG TCT CAC TTC GGC ACC 2301
Thr Gly His Gly Ser Glu Thr Ala Trp Gly Thr Ser His Phe Gly Thr
440 445 450

ACT CAT GTG AAG CAG CTT ACC AAC AGC AAC CAG CTA CCG TTT ATT TTC 2349
Thr His Val Lys Gln Leu Thr Asn Ser Asn Gln Leu Pro Phe Ile Phe
455 460 465

GAC GTA GCT TGT GTG AAT GGC GAT TTC CTA TTC AGC ATG CCT TGC TTC 2397
Asp Val Ala Cys Val Asn Gly Asp Phe Leu Phe Ser Met Pro Cys Phe
470 475 480

GCA GAA GCC CTG ATG CGT GCA CAA AAA GAT GGT AAG CCG ACA GGT ACT 2445
Ala Glu Ala Leu Met Arg Ala Gln Lys Asp Gly Lys Pro Thr Gly Thr
485 490 495

GTT GCT ATC ATA GCG TCT ACG ATC AAC CAG TCT TGG GCT TCT CCT ATG 2493
Val Ala Ile Ile Ala Ser Thr Ile Asn Gln Ser Trp Ala Ser Pro Met
500 505 510 515

CGC GGG CAG GAT GAG ATG AAC GAA ATT CTG TGC GAA AAA CAC CCG AAC 2541
Arg Gly Gln Asp Glu Met Asn Glu Ile Leu Cys Glu Lys His Pro Asn
520 525 530

AAC ATC AAG CGT ACT TTC GGT GGT GTC ACC ATG AAC GGT ATG TTT GCT 2589
Asn Ile Lys Arg Thr Phe Gly Gly Val Thr Met Asn Gly Met Phe Ala
535 540 545

ATG GTG GAA AAG TAT AAA AAG GAT GGT GAG AAG ATG CTC GAC ACA TGG 2637
Met Val Glu Lys Tyr Lys Lys Asp Gly Glu Lys Met Leu Asp Thr Trp
550 555 560

ACT GTT TTC GGC GAC CCC TCG CTG CTC GTT CGT ACA CTT GTC CCG ACC 2685
Thr Val Phe Gly Asp Pro Ser Leu Leu Val Arg Thr Leu Val Pro Thr
565 570 575

AAA ATG CAG GTT ACG GCT CCG GCT CAG ATT AAT TTG ACG GAT GCT TCA 2733
Lys Met Gln Val Thr Ala Pro Ala Gln Ile Asn Leu Thr Asp Ala Ser
580 585 590 595

GTC AAC GTA TCT TGC GAT TAT AAT GGT GCT ATT GCT ACC ATT TCA GCC 2781
Val Asn Val Ser Cys Asp Tyr Asn Gly Ala Ile Ala Thr Ile Ser Ala
600 605 610

AAT GGA AAG ATG TTC GGT TCT GCA GTT GTC GAA AAT GGA ACA GCT ACA 2829
Asn Gly Lys Met Phe Gly Ser Ala Val Val Glu Asn Gly Thr Ala Thr
615 620 625

ATC AAT CTG ACA GGT CTG ACA AAT GAA AGC ACG CTT ACC CTT ACA GTA 2877
Ile Asn Leu Thr Gly Leu Thr Asn Glu Ser Thr Leu Thr Leu Thr Val
630 635 640

GTT GGT TAC AAC AAA GAG ACG GTT ATT AAG ACC ATC AAC ACT AAT GGT 2925
Val Gly Tyr Asn Lys Glu Thr Val Ile Lys Thr Ile Asn Thr Asn Gly
645 650 655

GAG CCT AAC CCC TAC CAG CCC GTT TCC AAC TTG ACA GCT ACA ACG CAG 2973
Glu Pro Asn Pro Tyr Gln Pro Val Ser Asn Leu Thr Ala Thr Thr Gln
660 665 670 675

GGT CAG AAA GTA ACG CTC AAG TGG GAT GCA CCG AGC ACG AAA ACC AAT 3021
Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Ser Thr Lys Thr Asn
680 685 690
GCA ACC ACT AAT ACC GCT CGC AGC GTG GAT GGC ATA CGA GAA TTG GTT 3069
Ala Thr Thr Asn Thr Ala Arg Ser Val Asp Gly Ile Arg Glu Leu Val
695 700 705

CTT CTG TCA GTC AGC GAT GCC CCC GAA CTT CTT CGC AGC GGT CAG GCC 3117
Leu Leu Ser Val Ser Asp Ala Pro Glu Leu Leu Arg Ser Gly Gln Ala
710 715 720

GAG ATT GTT CTT GAA GCT CAC GAT GTT TGG AAT GAT GGA TCC 3159
Glu Ile Val Leu Glu Ala His Asp Val Trp Asn Asp Gly Ser
725 730 735
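
The paired nucleotide and amino acid lines of SEQ ID NO:3 can be cross-checked by translating the codons with the standard genetic code. The following Python sketch is an illustration under that assumption; the names are hypothetical and the example input is the first ten codons of the coding sequence beginning at position 949.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence (whitespace ignored) to one-letter codes."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First ten codons of the CDS in SEQ ID NO:3 (positions 949 to 978).
FIRST_CODONS = "ATG AAA AAC TTG AAC AAG TTT GTT TCG ATT"

if __name__ == "__main__":
    print(translate(FIRST_CODONS))
```

A codon that translates to a stop ('*') or to a residue different from the one printed beneath it in the listing indicates a transcription error in one of the two lines.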



(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 737 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Lys Asn Leu Asn Lys Phe Val Ser Ile Ala Leu Cys Ser Ser Leu
1 5 10 15

Leu Gly Gly Met Ala Phe Ala Gin Gin Thr Glu Leu Gly Arg Asn Pro 
20 25 30 

Asn Val Arg Leu Leu Glu Ser Thr Gin Gin Ser Val Thr Lys Val Gin 
35 40 45 

Phe Arg Met Asp Asn Leu Lys Phe Thr Glu Val Gin Thr Pro Lys Gly 
50 55 60 

He Gly Gin Val Pro Thr Tyr Thr Glu Gly Val Asn Leu Ser Glu Lys 
65 70 75 80 

Gly Met Pro Thr Leu Pro He Leu Ser Arg Ser Leu Ala Val Ser Asp 
85 90 95 

Thr Arg Glu Met Lys Val Glu Val Val Ser Ser Lys Phe He Glu Lys 
100 105 110 

Lys Asn Val Leu He Ala Pro Ser Lys Gly Met He Met Arg Asn Glu 
115 120 125 

Asp Pro Lys Lys He Pro Tyr Val Tyr Gly Lys Ser Tyr Ser Gin Asn 
130 135 140 

Lys Phe Phe Pro Gly Glu He Ala Thr Leu Asp Asp Pro Phe He Leu 

145 150 155 160 

Arg Asp Val Arg Gly Gin Val Val Asn Phe Ala Pro Leu Gin Tyr Asn 
165 170 175 

Pro Val Thr Lys Thr Leu Arg Ile Tyr Thr Glu Ile Thr Val Ala Val
180 185 190

Ser Glu Thr Ser Glu Gin Gly Lys Asn He Leu Asn Lys Lys Gly Thr 
195 200 205 

Phe Ala Gly Phe Glu Asp Thr Tyr Lys Arg Met Phe Met Asn Tyr Glu 
210 215 220 
Pro Gly Arg Tyr Thr Pro Val Glu Glu Lys Gln Asn Gly Arg Met Ile
225 230 235 240

Val Ile Val Ala Lys Lys Tyr Glu Gly Asp Ile Lys Asp Phe Val Asp
245 250 255

Trp Lys Asn Gin Arg Gly Leu Arg Thr Glu Val Lys Val Ala Glu Asp 
260 265 270 

lie Ala Ser Pro Val Thr Ala Asn Ala lie Gin Gin Phe Val Lys Gin 
275 280 285 

Glu Tyr Glu Lys Glu Gly Asn Asp Leu Thr Tyr Val Leu Leu Val Gly 
290 295 300 

Asp His Lys Asp lie Pro Ala Lys lie Thr Pro Gly He Lys Ser Asp 
305 310 315 320 

Gin Val Tyr Gly Gin He Val Gly Asn Asp His Tyr Asn Glu Val Phe 
325 330 335 

He Gly Arg Phe Ser Cys Glu Ser Lys Glu Asp Leu Lys Thr Gin He 
340 345 350 

Asp Arg Thr He His Tyr Glu Arg Asn He Thr Thr Glu Asp Lys Trp 
355 360 365 

Leu Gly Gin Ala Leu Cys He Ala Ser Ala Glu Gly Gly Pro Ser Ala 
370 375 380 

Asp Asn Gly Glu Ser Asp He Gin His Glu Asn Val He Ala Asn Leu 
385 390 395 400 

Leu Thr Gin Tyr Gly Tyr Thr Lys He He Lys Cys Tyr Asp Pro Gly 
405 410 415 

Val Thr Pro Lys Asn Ile Ile Asp Ala Phe Asn Gly Gly Ile Ser Leu
420 425 430

Val Asn Tyr Thr Gly His Gly Ser Glu Thr Ala Trp Gly Thr Ser His 
435 440 445 

Phe Gly Thr Thr His Val Lys Gin Leu Thr Asn Ser Asn Gin Leu Pro 
450 455 460 

Phe He Phe Asp Val Ala Cys Val Asn Gly Asp Phe Leu Phe Ser Met 
465 470 475 480 

Pro Cys Phe Ala Glu Ala Leu Met Arg Ala Gin Lys Asp Gly Lys Pro 
485 490 495 

Thr Gly Thr Val Ala He He Ala Ser Thr He Asn Gin Ser Trp Ala 
500 505 510 

Ser Pro Met Arg Gly Gin Asp Glu Met Asn Glu He Leu Cys Glu Lys 
515 520 525 

His Pro Asn Asn He Lys Arg Thr Phe Gly Gly Val Thr Met Asn Gly 
530 535 540 

Met Phe Ala Met Val Glu Lys Tyr Lys Lys Asp Gly Glu Lys Met Leu 
545 550 555 560 

Asp Thr Trp Thr Val Phe Gly Asp Pro Ser Leu Leu Val Arg Thr Leu 
565 570 575 
Val Pro Thr Lys Met Gln Val Thr Ala Pro Ala Gln Ile Asn Leu Thr
580 585 590

Asp Ala Ser Val Asn Val Ser Cys Asp Tyr Asn Gly Ala He Ala Thr 
595 600 605 

He Ser Ala Asn Gly Lys Met Phe Gly Ser Ala Val Val Glu Asn Gly 
610 615 620 

Thr Ala Thr He Asn Leu Thr Gly Leu Thr Asn Glu Ser Thr Leu Thr 
625 630 635 640 

Leu Thr Val Val Gly Tyr Asn Lys Glu Thr Val He Lys Thr He Asn 
645 650 655 

Thr Asn Gly Glu Pro Asn Pro Tyr Gin Pro Val Ser Asn Leu Thr Ala 
660 665 670 

Thr Thr Gin Gly Gin Lys Val Thr Leu Lys Trp Asp Ala Pro Ser Thr 
675 680 685 

Lys Thr Asn Ala Thr Thr Asn Thr Ala Arg Ser Val Asp Gly He Arg 
690 695 700 

Glu Leu Val Leu Leu Ser Val Ser Asp Ala Pro Glu Leu Leu Arg Ser 
705 710 715 720 

Gly Gin Ala Glu He Val Leu Glu Ala His Asp Val Trp Asn Asp Gly 
725 730 735 

Ser 
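
The stated length of SEQ ID NO:4 (737 amino acids) agrees with the CDS feature of SEQ ID NO:3 (949..3159), since the span covers 2211 nucleotides, or 737 codons. A short Python sketch of this consistency check follows; the function name is hypothetical, and the check assumes inclusive 1-based coordinates as used in the listing.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Number of codons in a CDS given inclusive 1-based start and end."""
    length = end - start + 1
    if length <= 0 or length % 3:
        raise ValueError("CDS span must be a positive multiple of three")
    return length // 3


if __name__ == "__main__":
    print(cds_codon_count(949, 3159))  # CDS feature of SEQ ID NO:3
    print(cds_codon_count(949, 6063))  # CDS feature of SEQ ID NO:5
```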



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7266 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Porphyromonas gingivalis

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 949..6063



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CTGCAGAGGG CTGGTAAAGA CCGCCTCGGG ATCGAGGCCT TTGAGACGGG CACAAGCCGC 60
CGCAGCCTCC TCTTCGAAGG TGTCTCGAAC GTCCACATCG GTGAATCCGT AGCAGTGCTC 120
ATTGCCATTG AGCAGCACCG AGGTGTGGCG CATCAGATAT ATTTTCATCA GTGGATTATT 180
AGGGTATCGG TCAGAAAAAG CCTTCCGAAT CCGACAAAGA TAGTAGAAAG AGAGTGCATC 240
TGAAAACAGA TCATTCGAGG ATTATCGATC AACTGAAAAG GCAGGAGTTG TTTTGCGTTT 300
TGGTTCGGAA AATTACCTGA TCAGCATTCG TAAAAACGTG GCGCGAGAAT TTTTTCGTTT 360
TGGCGCGAGA ATTAAAAATT TTTGGAACCA CAGCGAAAAA AATCTCGCGC CGTTTTCTCA 420
GGATTTACAG ACCACAATCC GAGCATTTTC GGTTCGTAAT TCATCGAAGA GACAGGTTTT 480
ACCGCATTGA AATCAGAGAG AGAATATCCG TAGTCCAACG GTTCATCCTT ATATCAGAGG 540
TTAAAAGATA TGGTACGCTC ATCGAGGAGC TGATTGGCTT AGTAGGTGAG ACTTTCTTAA 600
GAGACTATCG GCACCTACAG GAAGTTCATG GCACACAAGG CAAAGGAGGC AATCTTCGCA 660
GACCGGACTC ATATCAAAAG GATGAAACGA CTTTTCCATA CGACAACCAA ATAGCCGTCT 720
ACGGTAGACG AATGCAAACC CAATATGAGG CCATCAATCA ATCCGAATGA CAGCTTTTGG 780
GCAATATATT ATGCATATTT TGATTCGCGT TTAAAGGAAA AGTGCATATA TTTGCGATTG 840
TGGTATTTCT TTCGGTTTCT ATGTGAATTT TGTCTCCCAA GAAGACTTTA TAATGCATAA 900
ATACAGAAGG GGTACTACAC AGTAAAATCA TATTCTAATT TCATCAAA ATG AAA AAC 957
Met Lys Asn
1



TTG AAC AAG TTT GTT TCG ATT GCT CTT TGC TCT TCC TTA TTA GGA GGA 1005
Leu Asn Lys Phe Val Ser Ile Ala Leu Cys Ser Ser Leu Leu Gly Gly
5 10 15

ATG GCA TTT GCG CAG CAG ACA GAG TTG GGA CGC AAT CCG AAT GTC AGA 1053
Met Ala Phe Ala Gln Gln Thr Glu Leu Gly Arg Asn Pro Asn Val Arg
20 25 30 35

TTG CTC GAA TCC ACT CAG CAA TCG GTG ACA AAG GTT CAG TTC CGT ATG 1101
Leu Leu Glu Ser Thr Gln Gln Ser Val Thr Lys Val Gln Phe Arg Met
40 45 50

GAC AAC CTC AAG TTC ACC GAA GTT CAA ACC CCT AAG GGA ATC GGA CAA 1149
Asp Asn Leu Lys Phe Thr Glu Val Gln Thr Pro Lys Gly Ile Gly Gln
55 60 65

GTG CCG ACC TAT ACA GAA GGG GTT AAT CTT TCC GAA AAA GGG ATG CCT 1197
Val Pro Thr Tyr Thr Glu Gly Val Asn Leu Ser Glu Lys Gly Met Pro
70 75 80

ACG CTT CCC ATT CTA TCA CGC TCT TTG GCG GTT TCA GAC ACT CGT GAG 1245
Thr Leu Pro Ile Leu Ser Arg Ser Leu Ala Val Ser Asp Thr Arg Glu
85 90 95

ATG AAG GTA GAG GTT GTT TCC TCA AAG TTC ATC GAA AAG AAA AAT GTC 1293
Met Lys Val Glu Val Val Ser Ser Lys Phe Ile Glu Lys Lys Asn Val
100 105 110 115

CTG ATT GCA CCC TCC AAG GGC ATG ATT ATG CGT AAC GAA GAT CCG AAA 1341
Leu Ile Ala Pro Ser Lys Gly Met Ile Met Arg Asn Glu Asp Pro Lys
120 125 130

AAG ATC CCT TAC GTT TAT GGA AAG AGC TAC TCG CAA AAC AAA TTC TTC 1389
Lys Ile Pro Tyr Val Tyr Gly Lys Ser Tyr Ser Gln Asn Lys Phe Phe
135 140 145

CCG GGA GAG ATC GCC ACG CTT GAT GAT CCT TTT ATC CTT CGT GAT GTG 1437
Pro Gly Glu Ile Ala Thr Leu Asp Asp Pro Phe Ile Leu Arg Asp Val
150 155 160

CGT GGA CAG GTT GTA AAC TTT GCG CCT TTG CAG TAT AAC CCT GTG ACA 1485
Arg Gly Gln Val Val Asn Phe Ala Pro Leu Gln Tyr Asn Pro Val Thr
165 170 175
AAG ACG TTG CGC ATC TAT ACG GAA ATC ACT GTG GCA GTG AGC GAA ACT 1533
Lys Thr Leu Arg Ile Tyr Thr Glu Ile Thr Val Ala Val Ser Glu Thr
180 185 190 195

TCG GAA CAA GGC AAA AAT ATT CTG AAC AAG AAA GGT ACA TTT GCC GGC 1581 
Ser Glu Gin Gly Lys Aan lie Leu Asn Lys Lys Gly Thr Phe Ala Gly 
200 205 210 

TTT GAA GAC ACA TAG AAG CGC ATG TTC ATG AAC TAC GAG CCG GGG CGT 1629 
Phe Glu Asp Thr Tyr Lys Arg Met Phe Met Asn Tyr Glu Pro Gly Arg 
215 220 225 

TAC ACA CCG GTA GAG GAA AAA CAA AAT GGT CGT ATG ATC GTC ATC GTA 1677 
Tyr Thr Pro val Glu Glu Lys Gin Asn Gly Arg Met He Val He Val 
230 235 240 

GCC AAA AAG TAT GAG GGA GAT ATT AAA GAT TTC GTT GAT TGG AAA AAC 1725 
Ala Lys Lys Tyr Glu Gly Asp He Lys Asp Phe Val Asp Trp Lys Asn 
245 250 255 

CAA CGC GGT CTC CGT ACC GAG GTG AAA GTG GCA GAA GAT ATT GCT TCT 1773 
Gin Arg Gly Leu Arg Thr Glu Val Lys Val Ala Glu Asp He Ala Ser 
260 265 270 275 

CCC GTT ACA GCT AAT GCT ATT CAG CAG TTC GTT AAG CAA GAA TAC GAG 1821 
Pro Val Thr Ala Asn Ala He Gin Gin Phe Val Lys Gin Glu Tyr Glu 
280 285 290 

AAA GAA GGT AAT GAT TTG ACC TAT GTT CTT TTG GTT GGC GAT CAC AAA 1869 
Lys Glu Gly Asn Asp Leu Thr Tyr Val Leu Leu Val Gly Asp His Lys 
295 300 305 

GAT ATT CCT GCC AAA ATT ACT CCG GGG ATC AAA TCC GAC CAG GTA TAT 1917 
Asp He Pro Ala Lys He Thr Pro Gly He Lys Ser Asp Gin Val Tyr 
310 315 320 

GGA CAA ATA GTA GGT AAT GAC CAC TAC AAC GAA GTC TTC ATC GGT CGT 1965 
Gly Gin He Val Gly Asn Asp His Tyr Asn Glu Val Phe He Gly Arg 
325 330 335 

TTC TCA TGT GAG AGC AAA GAG GAT CTG AAG ACA CAA ATC GAT CGG ACT 2013 
Phe Ser Cys Glu Ser Lys Glu Asp Leu Lys Thr Gin He Asp Arg Thr 
340 345 350 355 

ATT CAC TAT GAG CGC AAT ATA ACC ACG GAA GAC AAA TOG CTC GGT CAG 2061 
He His Tyr Glu Arg Asn He Thr Thr Glu Asp Lys Trp Leu Gly Gin 
360 365 370 

GCT CTT TGT ATT GCT TCG GCT GAA GGA GGC CCA TCC GCA GAC AAT GGT 2109 
Ala Leu Cys He Ala Ser Ala Glu Gly Gly Pro Ser Ala Asp Asn Gly 
375 380 385 

GAA AGT GAT ATC CAG CAT GAG AAT GTA ATC GCC AAT CTG CTT ACC CAG 2157 
Glu Ser Asp He Gin His Glu Asn Val He Ala Asn Leu Leu Thr Gin 
390 395 400 

TAT GGC TAT ACC AAG ATT ATC AAA TGT TAT GAT CCG GGA GTA ACT CCT 2205 
Tyr Gly Tyr Thr Lys He He Lys Cys Tyr Asp Pro Gly Val Thr Pro 
405 410 415 

AAA AAC ATT ATT GAT GCT TTC AAC GGA GGA ATC TCG TTG GTC AAC TAT 2253 
Lys Asn He He Asp Ala Phe Asn Gly Gly He Ser Leu Val Asn Tyr 
420 425 430 435 

ACG GGC CAC GGT AGC GAA ACA GCT TGG GGT ACG TCT CAC TTC GGC ACC 2301 
Thr Gly His Gly Ser Glu Thr Ala Trp Gly Thr Ser His Phe Gly Thr 
440 445 450 
ACT CAT GTG AAG CAG CTT ACC AAC AGC AAC CAG CTA CCG TTT ATT TTC 2349
Thr His Val Lys Gln Leu Thr Asn Ser Asn Gln Leu Pro Phe Ile Phe
455 460 465

GAC GTA GCT TGT GTG AAT GGC GAT TTC CTA TTC AGC ATG CCT TGC TTC 2397 
Asp Val Ala Cys Val Asn Gly Asp Phe Leu Phe Ser Met Pro Cys Phe 

470 475 480 

GCA GAA GCC CTG ATG CGT GCA CAA AAA GAT GGT AAG CCG ACA GGT ACT 2445 
Ala Glu Ala Leu Met Arg Ala Gin Lys Asp Gly Lys Pro Thr Gly Thr 
485 490 495 

GTT GCT ATC ATA GCG TCT ACG ATC AAC CAG TCT TGG GCT TCT CCT ATG 2493 
Val Ala He He Ala Ser Thr He Asn Gin Ser Trp Ala Ser Pro Met 
500 505 510 515 

CQC GGG CAG GAT GAG ATG AAC GAA ATT CTG TGC GAA AAA CAC CCG AAC 2541 
Arg Gly Gin Asp Glu Met Asn Glu He Leu Cys Glu Lys His Pro Asn 
520 525 530 

AAC ATC AAG CGT ACT TTC GGT GGT GTC ACC ATG AAC GGT ATG TTT GCT 2589 
Asn He Lys Arg Thr Phe Gly Gly Val Thr Met Asn Gly Met Phe Ala 
535 540 545 

ATG GTG GAA AAG TAT AAA AAG GAT GGT GAG AAG ATG CTC GAC ACA TGG 2637 
Met Val Glu Lys Tyr Lys Lys Asp Gly Glu Lys Met Leu Asp Thr Trp 
550 555 560 

ACT GTT TTC GGC GAC CCC TCG CTG CTC GTT CGT ACA CTT GTC CCG ACC 2685 
Thr Val Phe Gly Asp Pro Ser Leu Leu Val Arg Thr Leu Val Pro Thr 
565 570 575 

AAA ATG CAG GTT ACG GCT CCG GCT CAG ATT AAT TTG ACG GAT GCT TCA 2733 
Lys Met Gin Val Thr Ala Pro Ala Gin He Asn Leu Thr Asp Ala Ser 
580 565 590 595 

GTC AAC GTA TCT TGC GAT TAT AAT GGT GCT ATT GCT ACC ATT TCA GCC 2781 
Val Asn Val Ser Cys Asp Tyr Asn Gly Ala He Ala Thr He Ser Ala 
600 605 610 

AAT GGA AAG ATG TTC GOT TCT GCA GTT GTC GAA AAT GGA ACA GCT ACA 2829 
Asn Gly Lys Met Phe Gly Ser Ala Val Val Glu Asn Gly Thr Ala Thr 
615 620 625 

ATC AAT CTG ACA GGT CTG ACA AAT GAA AGC ACG CTT ACC CTT ACA GTA 2877 
He Asn Leu Thr Gly Leu Thr Asn Glu Ser Thr Leu Thr Leu Thr Val 
630 635 640 

GTT GGT TAC AAC AAA GAG ACG GTT ATT AAG ACC ATC AAC ACT AAT GGT 2925 
Val Gly Tyr Asn Lys Glu Thr Val He Lys Thr He Asn Thr Asn Gly 
645 650 655 

GAG CCT AAC CCC TAC CAG CCC GTT TCC AAC TTG ACA GCT ACA ACG CAG 2973 
Glu Pro Asn Pro Tyr Gin Pro Val Ser Asn Leu Thr Ala Thr Thr Gin 
660 665 670 675 

GGT CAG AAA GTA ACG CTC AAG TGG GAT GCA CCG AGC ACG AAA ACC AAT 3021 
Gly Gin Lys Vsil Thr I^u Lys Trp Asp Ala Pro Ser Thr Lys Thr Asn 
680 685 690 

GCA ACC ACT AAT ACC GCT CGC AGC GTG GAT GGC ATA CGA GAA TTG GTT 3 069 

Ala Thr Thr Asn Thr Ala Arg Ser Val Asp Gly He Arg Glu Leu Val 
695 700 705 

CTT CTG TCA GTC AGC GAT GCC CCC GAA CTT CTT CGC AGC GGT CAG GCC 3117 
Leu Leu Ser Val Ser Asp Ala Pro Glu Leu Leu Arg Ser Gly Gin Ala 
710 715 720 
GAG ATT GTT CTT GAA GCT CAC GAT GTT TGG AAT GAT GGA TCC GGT TAT 3165
Glu Ile Val Leu Glu Ala His Asp Val Trp Asn Asp Gly Ser Gly Tyr
725 730 735

CAG ATT CTT TTO GAT GCA GAC CAT GAT CAA TAT GGA CAG GTT ATA CCC 3213 
Gin He Leu Leu Asp Ala Asp His Asp Gin Tyr Gly Gin Val He Pro 
740 745 750 755 

AGT GAT ACC CAT ACT CTT TGG CCG AAC TGT AGT GTC COG GCC AAT CTG 3261 
Ser Asp Thr His Thr Leu Trp Pro Asn Cys Ser Val Pro Ala Asn Leu 
760 765 770 

TTC GCT CCG TTC GAA TAT ACT GTT CCG GAA AAT GCA GAT CCT TCT TGT 3309 
Phe Ala Pro Phe Glu Tyr Thr Val Pro Glu Asn Ala Asp Pro Ser Cys 
775 780 785 

TCC CCT ACC AAT ATG ATA ATG GAT GGT ACT GCA TCC GTT AAT ATA CCG 3357 
Ser Pro Thr Asn Met He Met Asp Gly Thr Ala Ser Val Asn He Pro 
790 795 800 

GCC GOA ACT TAT GAC TTT GCA ATT GCT GCT CCT CAA GCA AAT GCA AAG 3405 
Ala Gly Thr Tyr Asp Phe Ala He Ala Ala Pro Gin Ala Asn Ala Lys 
805 810 815 

ATT TGG ATT GCC GGA CAA GGA CCG ACG AAA GAA OAT GAT TAT GTA TTT 3453 
He Trp He Ala Gly Gin Gly Pro Thr Lys Glu Asp Asp Tyr Val Phe 
820 825 830 835 

GAA GCC GGT AAA AAA TAC CAT TTC CTT ATG AAG AAG ATG GGT AGC GGT 3501 
Glu Ala Gly Lye Lys Tyr His Phe Leu Met Lys Lys Met Gly Ser Gly 
840 845 B50 

GAT GGA ACT GAA TTG ACT ATA AGC GAA GOT GGT GGA AGC GAT TAC ACC 3549 
Asp Gly Thr Glu Leu Thr He Ser Glu Gly Gly Gly Ser Asp Tyr Thr 
855 860 865 

TAT ACT GTC TAT CGT GAC GGC ACG AAG ATC AAG GAA GGT CTG ACG GCT 3597
Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Ala
870 875 880 

ACG ACA TTC GAA GAA GAC GGT GTA GCT ACG GGC AAT CAT GAG TAT TGC 3645 
Thr Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys 
885 890 895 

GTG GAA GTT AAG TAC ACA GCC GGC GTA TCT CCG AAG GTA TGT AAA GAC 3693 
Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys Lys Asp 
900 905 910 915 

GTT ACG GTA GAA GGA TCC AAT GAA TTT GCT CCT GTA CAG AAC CTG ACC 3741 
Val Thr Val Glu Gly Ser Asn Glu Phe Ala Pro Val Gln Asn Leu Thr
920 925 930 

GGT AGT GCA GTC GGC CAG AAA GTA ACG CTC AAG TGG GAT GCA CCT AAT 3789 
Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Asn
935 940 945 

GGT ACC CCG AAT CCA AAT CCG AAT CCG AAT CCG AAT CCC GGA ACA ACA 3837 
Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr 
950 955 960 

ACA CTT TCC GAA TCA TTC GAA AAT GGT ATT CCT GCC TCA TGG AAG ACG 3885 
Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr
965 970 975 

ATC GAT GCA GAC GGT GAC GGG CAT GGC TGG AAG CCT GGA AAT GCT CCC 3933 
Ile Asp Ala Asp Gly Asp Gly His Gly Trp Lys Pro Gly Asn Ala Pro
980 985 990 995 

GGA ATC GCT GGC TAC AAT AGC AAT GGT TGT GTA TAT TCA GAG TCA TTC 3981
Gly Ile Ala Gly Tyr Asn Ser Asn Gly Cys Val Tyr Ser Glu Ser Phe
1000 1005 1010 

GGT CTT GGT GGT ATA GGA GTT CTT ACC CCT GAC AAC TAT CTG ATA ACA 4029 
Gly Leu Gly Gly Ile Gly Val Leu Thr Pro Asp Asn Tyr Leu Ile Thr
1015 1020 1025 

CCG GCA TTG GAT TTG CCT AAC GGA GGT AAG TTG ACT TTC TGG GTA TGC 4077 
Pro Ala Leu Asp Leu Pro Asn Gly Gly Lys Leu Thr Phe Trp Val Cys 
1030 1035 1040 

GCA CAG GAT GCT AAT TAT GCA TCC GAG CAC TAT GCG GTG TAT GCA TCT 4125 
Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser
1045 1050 1055 

TCG ACC GGT AAC GAT GCA TCC AAC TTC ACG AAT GCT TTG TTG GAA GAG 4173 
Ser Thr Gly Asn Asp Ala Ser Asn Phe Thr Asn Ala Leu Leu Glu Glu 
1060 1065 1070 1075 

ACG ATT ACG GCA AAA GGT GTT CGC TCG CCG GAA GCT ATT CGT GGT CGT 4221
Thr Ile Thr Ala Lys Gly Val Arg Ser Pro Glu Ala Ile Arg Gly Arg
1080 1085 1090 

ATA CAG GGT ACT TGG CGC CAG AAG ACG GTA GAC CTT CCC GCA GGT ACG 4269 
Ile Gln Gly Thr Trp Arg Gln Lys Thr Val Asp Leu Pro Ala Gly Thr
1095 1100 1105 

AAA TAT GTT GCT TTC CGT CAC TTC CAA AGC ACG GAT ATG TTC TAC ATC 4317 
Lys Tyr Val Ala Phe Arg His Phe Gln Ser Thr Asp Met Phe Tyr Ile
1110 1115 1120 

GAC CTT GAT GAG GTT GAG ATC AAG GCC AAC GGC AAG CGC GCA GAC TTC 4365 
Asp Leu Asp Glu Val Glu Ile Lys Ala Asn Gly Lys Arg Ala Asp Phe
1125 1130 1135 

ACG GAA ACG TTC GAG TCT TCT ACT CAT GGA GAG GCA CCG GCG GAA TGG 4413 
Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu Trp 
1140 1145 1150 1155 

ACT ACT ATC GAT GCC GAT GGC GAT GGT CAG GGT TGG CTC TGT CTG TCT 4461 
Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys Leu Ser
1160 1165 1170

TCC GGA CAA TTG GAC TGG CTG ACA GCT CAT GGC GGC ACC AAC GTA GTA 4509 
Ser Gly Gln Leu Asp Trp Leu Thr Ala His Gly Gly Thr Asn Val Val
1175 1180 1185 

GCC TCT TTC TCA TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC TAT CTC 4557 
Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr Leu 
1190 1195 1200 

ATC TCA AAG GAT GTT ACA GGC GCA ACG AAG GTA AAG TAC TAC TAT GCA 4605
Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr Ala
1205 1210 1215 

GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC TCC AAG 4653 
Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile Ser Lys
1220 1225 1230 1235 

ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA ACG CCT 4701 
Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr Pro 
1240 1245 1250 

AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG GAA GCC 4749
Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu Ala
1255 1260 1265 

AAT GGC GCC AAA CCT CAA AGT GTA TGG ATC GAG CGT ACG GTA GAT TTG 4797
Asn Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val Asp Leu
1270 1275 1280 

CCT GCG GGC ACG AAG TAT GTT GCT TTC CGT CAC TAC AAT TGC TCG GAT 4845
Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser Asp 
1285 1290 1295 

TTG AAC TAC ATT CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT GGC AGC 4893 
Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly Gly Ser
1300 1305 1310 1315 

CCC ACC CCG ACC GAT TAT ACC TAC ACG GTG TAT CGT GAC GGT ACG AAG 4941
Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys
1320 1325 1330 

ATC AAG GAA GGT CTG ACC GAA ACG ACC TTC GAA GAA GAC GGC GTA GCT 4989
Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val Ala
1335 1340 1345 

ACA GGC AAT CAT GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC GGC GTA 5037 
Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val 
1350 1355 1360 

TCT CCG AAA GAG TGC GTA AAC GTA ACT ATT AAT CCG ACT CAG TTC AAT 5085 
Ser Pro Lys Glu Cys Val Asn Val Thr Ile Asn Pro Thr Gln Phe Asn
1365 1370 1375 

CCT GTA AAG AAC CTG AAG GCA CAA CCG GAT GGC GGC GAC GTG GTT CTC 5133
Pro Val Lys Asn Leu Lys Ala Gln Pro Asp Gly Gly Asp Val Val Leu
1380 1385 1390 1395 

AAG TGG GAA GCC CCG AGC GCA AAA AAG ACA GAA GGT TCT CGT GAA GTA 5181 
Lys Trp Glu Ala Pro Ser Ala Lys Lys Thr Glu Gly Ser Arg Glu Val 
1400 1405 1410 

AAA CGG ATC GGA GAC GGT CTT TTC GTT ACG ATC GAA CCT GCA AAC GAT 5229
Lys Arg Ile Gly Asp Gly Leu Phe Val Thr Ile Glu Pro Ala Asn Asp
1415 1420 1425 

GTA CGT GCC AAC GAA GCC AAG GTT GTG CTC GCA GCA GAC AAC GTA TGG 5277 
Val Arg Ala Asn Glu Ala Lys Val Val Leu Ala Ala Asp Asn Val Trp 
1430 1435 1440 

GGA GAC AAT ACG GGT TAC CAG TTC TTG TTG GAT GCC GAT CAC AAT ACA 5325 
Gly Asp Asn Thr Gly Tyr Gln Phe Leu Leu Asp Ala Asp His Asn Thr
1445 1450 1455 

TTC GGA AGT GTC ATT CCG GCA ACC GGT CCT CTC TTT ACC GGA ACA GCT 5373
Phe Gly Ser Val Ile Pro Ala Thr Gly Pro Leu Phe Thr Gly Thr Ala
1460 1465 1470 1475 

TCT TCC AAT CTT TAC AGT GCG AAC TTC GAG TAT TTG ATC CCG GCC AAT 5421 
Ser Ser Asn Leu Tyr Ser Ala Asn Phe Glu Tyr Leu Ile Pro Ala Asn
1480 1485 1490 

GCC GAT CCT GTT GTT ACT ACA CAG AAT ATT ATC GTT ACA GGA CAG GGT 5469
Ala Asp Pro Val Val Thr Thr Gln Asn Ile Ile Val Thr Gly Gln Gly
1495 1500 1505 

GAA GTT GTA ATC CCC GGT GGT GTT TAC GAC TAT TGC ATT ACG AAC CCG 5517
Glu Val Val Ile Pro Gly Gly Val Tyr Asp Tyr Cys Ile Thr Asn Pro
1510 1515 1520 

GAA CCT GCA TCC GGA AAG ATG TGG ATC GCA GGA GAT GGA GGC AAC CAG 5565 
Glu Pro Ala Ser Gly Lys Met Trp Ile Ala Gly Asp Gly Gly Asn Gln
1525 1530 1535 






CCT GCA CGT TAT GAC GAT TTC ACA TTC GAA GCA GGC AAG AAG TAC ACC 5613 
Pro Ala Arg Tyr Asp Asp Phe Thr Phe Glu Ala Gly Lys Lys Tyr Thr
1540 1545 1550 1555 

TTC ACG ATG CGT CGC GCC GGA ATG GGA GAT GGA ACT GAT ATG GAA GTC 5661 
Phe Thr Met Arg Arg Ala Gly Met Gly Asp Gly Thr Asp Met Glu Val 
1560 1565 1570 

GAA GAC GAT TCA CCT GCA AGC TAT ACC TAT ACA GTC TAT CGT GAC GGC 5709 
Glu Asp Asp Ser Pro Ala Ser Tyr Thr Tyr Thr Val Tyr Arg Asp Gly 
1575 1580 1585 

ACG AAG ATC AAG GAA GGT CTG ACC GAA ACG ACC TAC CGC GAT GCA GGA 5757 
Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Tyr Arg Asp Ala Gly
1590 1595 1600 

ATG AGT GCA CAA TCT CAT GAG TAT TGC GTA GAG GTT AAG TAC GCA GCC 5805
Met Ser Ala Gln Ser His Glu Tyr Cys Val Glu Val Lys Tyr Ala Ala
1605 1610 1615 

GGC GTA TCT CCG AAG GTT TGT GTG GAT TAT ATT CCT GAC GGA GTG GCA 5853
Gly Val Ser Pro Lys Val Cys Val Asp Tyr Ile Pro Asp Gly Val Ala
1620 1625 1630 1635 

GAC GTA ACG GCT CAG AAG CCT TAC ACG CTG ACA GTT GTT GGA AAG ACG 5901 
Asp Val Thr Ala Gln Lys Pro Tyr Thr Leu Thr Val Val Gly Lys Thr
1640 1645 1650 

ATC ACG GTA ACT TGC CAA GGC GAA GCT ATG ATC TAC GAC ATG AAC GGT 5949 
Ile Thr Val Thr Cys Gln Gly Glu Ala Met Ile Tyr Asp Met Asn Gly
1655 1660 1665 

CGT CGT CTG GCA GCC GGT CGC AAC ACA GTT GTT TAC ACG GCT CAG GGC 5997 
Arg Arg Leu Ala Ala Gly Arg Asn Thr Val Val Tyr Thr Ala Gln Gly
1670 1675 1680 

GGC TAC TAT GCA GTC ATG GTT GTC GTT GAC GGC AAG TCT TAC GTA GAG 6045 
Gly Tyr Tyr Ala Val Met Val Val Val Asp Gly Lys Ser Tyr Val Glu
1685 1690 1695 



AAA CTC GCT GTA AAG TAA TTCTGTCTTG GACTCGGAGA CTTTGTGCAG 6093
Lys Leu Ala Val Lys *
1700 1705

ACACTTTTAA TATAGGTCTG TAATTGTCTC AGAGTATGAA TCGATCGCCC GACCTCCTTT 6153

TAAGGAAGTC TGGGCGACTT CGTTTTTATG CCTATTATTC TAATATACTT CTGAAACAAT 6213

TTGTTCCAAA AAGTTGCATG AAAAGATTAT CTTACTATCT TTGCACTGCA AAAGGGGAGT 6273

TTCCTAAGGT TTTCCCCGGA GTAGTACGGT AATAACGGTG TGGTAGTTCA GCTGGTTAGA 6333

ATACCTGCCT GTCACGCAGG GGGTCGCGGG TTCGAGTCCC GTCCATACCG CTAAATAGCT 6393

GAAAGATAGG CTATAGGTCA TCTGAAGCAA TTTTAGAAAC GAATCCAAAA GCGTCTTAAT 6453

TCCAACGAAT TAAGGCGCTT TTTCTTTGTC GCCACCCCAC ACGTCGGATG AGGTTCGGAA 6513

TAGGCGTATA TTCCGTAAAT ATGCCTCCGG TGGTTCCATT TTGGTTACAA AAAACAAAGG 6573

GGCTGAAAAT TGTAACCACA GACGACGTTA AGACGATGTT TAGACGATTG ACAAATTACT 6633

CTGTTTCAAA ATCATATGTC GAACTTTGTA GCCGTATGGT TACACTAATT TTGGAGCAAA 6693

ATGAAGAGTC AATTTCGTTC AGTTTTTTAC TTGCGCAGCA ATTACATCAA CAAAGAAGGT 6753

AAAACTCCTG TCCTTATTCG TATTTATCTG AATAAGGAAC GCCTGTCGTT GGGTTCGACA 6813

GGGCTGGCTG TTAATCCCAT ACAATGGGAT TCAGAAAAAG AGAAAGTCAA AGGACATAGT 6873

GCAGAAGCAC TTGAAGTCAA TCGAAAGATC GAAGAAATCA GGGCTGATAT TCTGACCATT 6933 

TACAAACGTT TGGAAGTAAC AGTAGATGAT TTGACGCCGG AGAGGATCAA ATCGGAATAC 6993 

TGCGGACAGA CGGATACATT AAACAGTATA GTGGAACTTT TCGATAAACA TAACGAGGAT 7053 

GTCCGGGCCC AGGTGGGAAT CAATAAAACG GCTGCCACTT TACAAAAATA CGAAAACAGC 7113 

AAACGGCATT TTACCCGATT CCTCAAAGCG AAGTACAACA GAACGGATCT CAAATTCTCA 7173 

GAGCTTACCC CGTTGGTCAT TCATAACTTT GAGATATATC TGCTGACTGT AGCCCATTGT 7233

TGCCCGAATA CGGCAACCAA AATCTTGAAG CTT 7266 

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1705 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Lys Asn Leu Asn Lys Phe Val Ser Ile Ala Leu Cys Ser Ser Leu
1 5 10 15

Leu Gly Gly Met Ala Phe Ala Gln Gln Thr Glu Leu Gly Arg Asn Pro
20 25 30 

Asn Val Arg Leu Leu Glu Ser Thr Gln Gln Ser Val Thr Lys Val Gln
35 40 45 

Phe Arg Met Asp Asn Leu Lys Phe Thr Glu Val Gln Thr Pro Lys Gly
50 55 60 

Ile Gly Gln Val Pro Thr Tyr Thr Glu Gly Val Asn Leu Ser Glu Lys
65 70 75 80

Gly Met Pro Thr Leu Pro Ile Leu Ser Arg Ser Leu Ala Val Ser Asp
85 90 95 

Thr Arg Glu Met Lys Val Glu Val Val Ser Ser Lys Phe Ile Glu Lys
100 105 110 

Lys Asn Val Leu Ile Ala Pro Ser Lys Gly Met Ile Met Arg Asn Glu
115 120 125 

Asp Pro Lys Lys Ile Pro Tyr Val Tyr Gly Lys Ser Tyr Ser Gln Asn
130 135 140 

Lys Phe Phe Pro Gly Glu Ile Ala Thr Leu Asp Asp Pro Phe Ile Leu
145 150 155 160 

Arg Asp Val Arg Gly Gln Val Val Asn Phe Ala Pro Leu Gln Tyr Asn
165 170 175 

Pro Val Thr Lys Thr Leu Arg Ile Tyr Thr Glu Ile Thr Val Ala Val
180 185 190

Ser Glu Thr Ser Glu Gln Gly Lys Asn Ile Leu Asn Lys Lys Gly Thr
195 200 205 

Phe Ala Gly Phe Glu Asp Thr Tyr Lys Arg Met Phe Met Asn Tyr Glu
210 215 220 

Pro Gly Arg Tyr Thr Pro Val Glu Glu Lys Gln Asn Gly Arg Met Ile
225 230 235 240 

Val Ile Val Ala Lys Lys Tyr Glu Gly Asp Ile Lys Asp Phe Val Asp
245 250 255 

Trp Lys Asn Gln Arg Gly Leu Arg Thr Glu Val Lys Val Ala Glu Asp
260 265 270 

Ile Ala Ser Pro Val Thr Ala Asn Ala Ile Gln Gln Phe Val Lys Gln
275 280 285

Glu Tyr Glu Lys Glu Gly Asn Asp Leu Thr Tyr Val Leu Leu Val Gly 
290 295 300 

Asp His Lys Asp Ile Pro Ala Lys Ile Thr Pro Gly Ile Lys Ser Asp
305 310 315 320 

Gln Val Tyr Gly Gln Ile Val Gly Asn Asp His Tyr Asn Glu Val Phe
325 330 335 

Ile Gly Arg Phe Ser Cys Glu Ser Lys Glu Asp Leu Lys Thr Gln Ile
340 345 350 

Asp Arg Thr Ile His Tyr Glu Arg Asn Ile Thr Thr Glu Asp Lys Trp
355 360 365 

Leu Gly Gln Ala Leu Cys Ile Ala Ser Ala Glu Gly Gly Pro Ser Ala
370 375 380 

Asp Asn Gly Glu Ser Asp Ile Gln His Glu Asn Val Ile Ala Asn Leu
385 390 395 400

Leu Thr Gln Tyr Gly Tyr Thr Lys Ile Ile Lys Cys Tyr Asp Pro Gly
405 410 415 

Val Thr Pro Lys Asn Ile Ile Asp Ala Phe Asn Gly Gly Ile Ser Leu
420 425 430 

Val Asn Tyr Thr Gly His Gly Ser Glu Thr Ala Trp Gly Thr Ser His 
435 440 445 

Phe Gly Thr Thr His Val Lys Gln Leu Thr Asn Ser Asn Gln Leu Pro
450 455 460 

Phe Ile Phe Asp Val Ala Cys Val Asn Gly Asp Phe Leu Phe Ser Met
465 470 475 480 

Pro Cys Phe Ala Glu Ala Leu Met Arg Ala Gln Lys Asp Gly Lys Pro
485 490 495 

Thr Gly Thr Val Ala Ile Ile Ala Ser Thr Ile Asn Gln Ser Trp Ala
500 505 510 

Ser Pro Met Arg Gly Gln Asp Glu Met Asn Glu Ile Leu Cys Glu Lys
515 520 525 

His Pro Asn Asn Ile Lys Arg Thr Phe Gly Gly Val Thr Met Asn Gly
530 535 540 

Met Phe Ala Met Val Glu Lys Tyr Lys Lys Asp Gly Glu Lys Met Leu 
545 550 555 560 

Asp Thr Trp Thr Val Phe Gly Asp Pro Ser Leu Leu Val Arg Thr Leu
565 570 575 

Val Pro Thr Lys Met Gln Val Thr Ala Pro Ala Gln Ile Asn Leu Thr
580 585 590 

Asp Ala Ser Val Asn Val Ser Cys Asp Tyr Asn Gly Ala Ile Ala Thr
595 600 605 

Ile Ser Ala Asn Gly Lys Met Phe Gly Ser Ala Val Val Glu Asn Gly
610 615 620 

Thr Ala Thr Ile Asn Leu Thr Gly Leu Thr Asn Glu Ser Thr Leu Thr
625 630 635 640 

Leu Thr Val Val Gly Tyr Asn Lys Glu Thr Val Ile Lys Thr Ile Asn
645 650 655 

Thr Asn Gly Glu Pro Asn Pro Tyr Gln Pro Val Ser Asn Leu Thr Ala
660 665 670 

Thr Thr Gln Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Ser Thr
675 680 685 

Lys Thr Asn Ala Thr Thr Asn Thr Ala Arg Ser Val Asp Gly Ile Arg
690 695 700

Glu Leu Val Leu Leu Ser Val Ser Asp Ala Pro Glu Leu Leu Arg Ser 
705 710 715 720 

Gly Gln Ala Glu Ile Val Leu Glu Ala His Asp Val Trp Asn Asp Gly
725 730 735 

Ser Gly Tyr Gln Ile Leu Leu Asp Ala Asp His Asp Gln Tyr Gly Gln
740 745 750 

Val Ile Pro Ser Asp Thr His Thr Leu Trp Pro Asn Cys Ser Val Pro
755 760 765 

Ala Asn Leu Phe Ala Pro Phe Glu Tyr Thr Val Pro Glu Asn Ala Asp 
770 775 780 

Pro Ser Cys Ser Pro Thr Asn Met Ile Met Asp Gly Thr Ala Ser Val
785 790 795 800 

Asn Ile Pro Ala Gly Thr Tyr Asp Phe Ala Ile Ala Ala Pro Gln Ala
805 810 815 

Asn Ala Lys Ile Trp Ile Ala Gly Gln Gly Pro Thr Lys Glu Asp Asp
820 825 830 

Tyr Val Phe Glu Ala Gly Lys Lys Tyr His Phe Leu Met Lys Lys Met 
835 840 845 

Gly Ser Gly Asp Gly Thr Glu Leu Thr Ile Ser Glu Gly Gly Gly Ser
850 855 860 

Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly
865 870 875 880 

Leu Thr Ala Thr Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His 
885 890 895 

Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val 
900 905 910 

Cys Lys Asp Val Thr Val Glu Gly Ser Asn Glu Phe Ala Pro Val Gln
915 920 925 

Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp
930 935 940 

Ala Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Asn Pro 
945 950 955 960 

Gly Thr Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser
965 970 975 

Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly His Gly Trp Lys Pro Gly
980 985 990 

Asn Ala Pro Gly Ile Ala Gly Tyr Asn Ser Asn Gly Cys Val Tyr Ser
995 1000 1005 

Glu Ser Phe Gly Leu Gly Gly Ile Gly Val Leu Thr Pro Asp Asn Tyr
1010 1015 1020

Leu Ile Thr Pro Ala Leu Asp Leu Pro Asn Gly Gly Lys Leu Thr Phe
1025 1030 1035 1040 

Trp Val Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val
1045 1050 1055 

Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Thr Asn Ala Leu 
1060 1065 1070 

Leu Glu Glu Thr Ile Thr Ala Lys Gly Val Arg Ser Pro Glu Ala Ile
1075 1080 1085

Arg Gly Arg Ile Gln Gly Thr Trp Arg Gln Lys Thr Val Asp Leu Pro
1090 1095 1100 

Ala Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gln Ser Thr Asp Met
1105 1110 1115 1120 

Phe Tyr Ile Asp Leu Asp Glu Val Glu Ile Lys Ala Asn Gly Lys Arg
1125 1130 1135 

Ala Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro 
1140 1145 1150 

Ala Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu
1155 1160 1165 

Cys Leu Ser Ser Gly Gln Leu Asp Trp Leu Thr Ala His Gly Gly Thr
1170 1175 1180

Asn Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp 
1185 1190 1195 1200 

Asn Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr
1205 1210 1215 

Tyr Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met 
1220 1225 1230 

Ile Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu
1235 1240 1245 

Glu Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser
1250 1255 1260 

Thr Glu Ala Asn Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr
1265 1270 1275 1280 

Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn 
1285 1290 1295 

Cys Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met
1300 1305 1310 

Gly Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp 
1315 1320 1325 

Gly Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp
1330 1335 1340 

Gly Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr
1345 1350 1355 1360 

Ala Gly Val Ser Pro Lys Glu Cys Val Asn Val Thr Ile Asn Pro Thr
1365 1370 1375 

Gln Phe Asn Pro Val Lys Asn Leu Lys Ala Gln Pro Asp Gly Gly Asp
1380 1385 1390 

Val Val Leu Lys Trp Glu Ala Pro Ser Ala Lys Lys Thr Glu Gly Ser 
1395 1400 1405 

Arg Glu Val Lys Arg Ile Gly Asp Gly Leu Phe Val Thr Ile Glu Pro
1410 1415 1420 

Ala Asn Asp Val Arg Ala Asn Glu Ala Lys Val Val Leu Ala Ala Asp
1425 1430 1435 1440 

Asn Val Trp Gly Asp Asn Thr Gly Tyr Gln Phe Leu Leu Asp Ala Asp
1445 1450 1455 

His Asn Thr Phe Gly Ser Val Ile Pro Ala Thr Gly Pro Leu Phe Thr
1460 1465 1470 

Gly Thr Ala Ser Ser Asn Leu Tyr Ser Ala Asn Phe Glu Tyr Leu Ile
1475 1480 1485 

Pro Ala Asn Ala Asp Pro Val Val Thr Thr Gln Asn Ile Ile Val Thr
1490 1495 1500 

Gly Gln Gly Glu Val Val Ile Pro Gly Gly Val Tyr Asp Tyr Cys Ile
1505 1510 1515 1520 

Thr Asn Pro Glu Pro Ala Ser Gly Lys Met Trp Ile Ala Gly Asp Gly
1525 1530 1535 

Gly Asn Gln Pro Ala Arg Tyr Asp Asp Phe Thr Phe Glu Ala Gly Lys
1540 1545 1550 

Lys Tyr Thr Phe Thr Met Arg Arg Ala Gly Met Gly Asp Gly Thr Asp 
1555 1560 1565 

Met Glu Val Glu Asp Asp Ser Pro Ala Ser Tyr Thr Tyr Thr Val Tyr
1570 1575 1580 

Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Tyr Arg
1585 1590 1595 1600 

Asp Ala Gly Met Ser Ala Gln Ser His Glu Tyr Cys Val Glu Val Lys
1605 1610 1615 

Tyr Ala Ala Gly Val Ser Pro Lys Val Cys Val Asp Tyr Ile Pro Asp
1620 1625 1630 

Gly Val Ala Asp Val Thr Ala Gln Lys Pro Tyr Thr Leu Thr Val Val
1635 1640 1645 

Gly Lys Thr Ile Thr Val Thr Cys Gln Gly Glu Ala Met Ile Tyr Asp
1650 1655 1660 

Met Asn Gly Arg Arg Leu Ala Ala Gly Arg Asn Thr Val Val Tyr Thr 
1665 1670 1675 1680 

Ala Gln Gly Gly Tyr Tyr Ala Val Met Val Val Val Asp Gly Lys Ser
1685 1690 1695 

Tyr Val Glu Lys Leu Ala Val Lys *
1700 1705

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3561 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1336..2862



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



CTGCAGAAGT TCACTCTTTC GCATATAGTG ACCCTCTTTT CTCTCAGCAT AATGGCACCT 60

ATCATATCAG TAAGGGGCGT ATTGTCTTTT CGAACAATGT ACAGCCCGAG AACTCTTTAC 120

TTCCACATCA CACCCCCGAC TCCTTAGTCA AGGATCTTTT TTCCGCTTTC CCCTCCGCTC 180

TCTTCCTCAT GCTGGACTGA CTTAACCTTG GTCTGCTCTA CTTTTCGGTT GTAAATACAT 240

GCAACACAAT AAcn-rrn'A AGTGTTGTTA GACAACACTT TTACAAGACT CTGACTTTTA 300

ATGAGGTGGA GCATGAACCT TTTCCTCTTT CATCTTCTCC TTCAGATTAC AGTCAATATT 360

TTGGCAAAAG GCTAATTGAC AGCCTTTTAT AAGGGTTAAT CCCTTGTCGC TTATATTGAA 420

AACATGTTCT TTACGATCCG ATACTCTTCT TAAATCGAAA TTTTTCTCTA AATTGCGCCG 480

CAACAAAACT CCTTGAGAAA AGTACCAATA GAAATAGAAG GTAGCATTTT GCCTTTAAAT 540

TCCTTTTCTT TTCTTGGATT GTTCTTGAAA TGAATCTTAT TTCTGGATCT TTTTTGTTTT 600

TTTTAACCCG GCCGTGGTTC TCTGAATCAC GACCATAAAT TGTTTTAAAG TATGAGGAAA 660

TTATTATTGC TGATCGCGGC GTCCCTTTTG GGAGTTGGTC TTTACGCCCA AAACGCCAAG 720

ATTAAGCTTG ATGCTCCGAC TACTCGAACG ACATGCACGA ACAATAGCTT CAAGCAGTTC 780

GATGCAAGCT TTTCGTTCAA TGAAGTCGAG CTGAGAAAGG TGGAGACCAA AGGTGGTACT 840

TTCGCCTCAG TGTCAATTCC GGGTGCATTC CCGACCGGTG AGGTTGGTTC TCCCGAAGTG 900

CCAGCAGTTA GGAAGTTGAT TGCTGTGCCT GTCGAAGCCA GACCTGTTGT TCGCGTGAAA 960

AGTTTTACCG AGCAAGTTTA CTGTCTGAAC CAATACGGTT CCGAAAAGCT CATGCCACAT 1020

CAACCCTCTA TGAGCAAGAG TGATGATCCC GAAAAGCTTC CCTTCGCTTA CAATGCTGCT 1080

GCTTATGCAC GCAAAGGTTT TGTCGGACAA GAACTGACCC AAGTAGAAAT GTTGGGGACA 1140

ATGCGTGGTG TTCGCATTGC AGCTCTTACC ATTAATCCTG TTCAGTATGA TGTAGTTGCA 1200

AACCAATTGA AGGTTAGAAA CAACATCGAA ATTGAAGTAA GCTTTCAGGG AGCTGATGAA 1260

GTAGCTACAC AACGTTTGTA TGATGCTTCT TTTAGCCCTT ATTTCGAAAC AGCTTATAAA 1320

CAGCTCTTCA ATAGA GAT GTT TAT ACA GAT CAT GGC GAC TTG TAT AAT ACG 1371
                 Asp Val Tyr Thr Asp His Gly Asp Leu Tyr Asn Thr
                 1               5                   10



CCG GTT CGT ATG CTT GTT GTT GCA GGT GCA AAA TTC AAA GAA GCT CTC 1419 
Pro Val Arg Met Leu Val Val Ala Gly Ala Lys Phe Lys Glu Ala Leu 
15 20 25 

AAG CCT TGG CTC ACT TGG AAG GCT CAA AAG GGC TTC TAT CTG GAT GTG 1467 
Lys Pro Trp Leu Thr Trp Lys Ala Gln Lys Gly Phe Tyr Leu Asp Val
30 35 40 

CAT TAC ACA GAC GAA GCT GAA GTA GGA ACG ACA AAC GCC TCT ATC AAG 1515
His Tyr Thr Asp Glu Ala Glu Val Gly Thr Thr Asn Ala Ser Ile Lys
45 50 55 60 

GCA TTT ATT CAC AAG AAA TAC AAT GAT GGA TTG GCA GCT ACT GCT GCT 1563 
Ala Phe Ile His Lys Lys Tyr Asn Asp Gly Leu Ala Ala Thr Ala Ala
65 70 75 

CCG GTC TTC TTG GCT TTG GTT GGT GAC ACT GAC GTT ATT AGC GGA GAA 1611
Pro Val Phe Leu Ala Leu Val Gly Asp Thr Asp Val Ile Ser Gly Glu
80 85 90 

AAA GGA AAG AAA ACA AAA AAA GTT ACC GAC TTG TAT TAC ACT GCA GTC 1659 
Lys Gly Lys Lys Thr Lys Lys Val Thr Asp Leu Tyr Tyr Thr Ala Val 
95 100 105 

GAT GGC GAC TAT TTC CCT GAA ATG TAT ACT TTC CGT ATG TCT GCT TCT 1707 
Asp Gly Asp Tyr Phe Pro Glu Met Tyr Thr Phe Arg Met Ser Ala Ser 
110 115 120 

TCC CCA GAA GAA CTG ACG AAC ATC ATT GAT AAG GTA TTG ATG TAT GAA 1755 
Ser Pro Glu Glu Leu Thr Asn Ile Ile Asp Lys Val Leu Met Tyr Glu
125 130 135 140 

AAG GCT ACT ATG CCG GAT AAG AGC TAT TTG GAA AAG GCC CTC TTG ATT 1803
Lys Ala Thr Met Pro Asp Lys Ser Tyr Leu Glu Lys Ala Leu Leu Ile
145 150 155 

GCC GGT GCT GAC TCC TAC TGG AAT CCT AAG ATA GGC CAG CAA ACC ATC 1851 
Ala Gly Ala Asp Ser Tyr Trp Asn Pro Lys Ile Gly Gln Gln Thr Ile
160 165 170 

AAA TAT GCT GTA CAG TAT TAC TAC AAT CAA GAT CAT GGC TAT ACA GAT 1899 
Lys Tyr Ala Val Gln Tyr Tyr Tyr Asn Gln Asp His Gly Tyr Thr Asp
175 180 185 

GTG TAC ACT TAC CCT AAA GCT CCT TAT ACA GGC TGC TAT AGT CAC TTG 1947
Val Tyr Thr Tyr Pro Lys Ala Pro Tyr Thr Gly Cys Tyr Ser His Leu
190 195 200 

AAT ACC GGT GTC GGC TTT GCC AAC TAT ACA GTG CAT GGA TCT GAG ACA 1995
Asn Thr Gly Val Gly Phe Ala Asn Tyr Thr Val His Gly Ser Glu Thr 
205 210 215 220 

TCA TGG GCA GAT CCG TCC GTG ACC GCC ACT CAA GTG AAA GCA CTC ACA 2043 
Ser Trp Ala Asp Pro Ser Val Thr Ala Thr Gln Val Lys Ala Leu Thr
225 230 235 

AAT AAG AAC AAA TAC TTC TTA GCT ATT GGG AAC TGC TGT GTT ACA GCT 2091 
Asn Lys Asn Lys Tyr Phe Leu Ala Ile Gly Asn Cys Cys Val Thr Ala
240 245 250 

CAA TTC GAT TAT CCA CAG CCT TGC TTT GGA GAG GTA ATG ACT CGT GTC 2139
Gln Phe Asp Tyr Pro Gln Pro Cys Phe Gly Glu Val Met Thr Arg Val
255 260 265 

AAG GAG AAA GGT GCT TAT GCC TAT ATC GGT TCA TCT CCA AAT TCT TAT 2187 
Lys Glu Lys Gly Ala Tyr Ala Tyr Ile Gly Ser Ser Pro Asn Ser Tyr
270 275 280 

TGG GGC GAG GAC TAC TAT TGG AGT GTG GGT GCT AAT GCA GTA TTT GGT 2235 
Trp Gly Glu Asp Tyr Tyr Trp Ser Val Gly Ala Asn Ala Val Phe Gly 
285 290 295 300 

GTT CAG CCT ACT TTT GAA GGT ACG TCT ATG GGT TCT TAT GAT GCT ACA 2283 
Val Gln Pro Thr Phe Glu Gly Thr Ser Met Gly Ser Tyr Asp Ala Thr
305 310 315 

TTC TTG GAA GAT TCG TAC AAC ACA GTG AAC TCT ATT ATG TGG GCA GGT 2331 
Phe Leu Glu Asp Ser Tyr Asn Thr Val Asn Ser Ile Met Trp Ala Gly
320 325 330 

AAT CTT GCT GCT ACT CAT GCC GAA AAT ATC GGC AAT GTT ACC CAT ATC 2379 
Asn Leu Ala Ala Thr His Ala Glu Asn Ile Gly Asn Val Thr His Ile
335 340 345 

GGT GCT CAT TAC TAT TGG GAA GCT TAT CAT GTC CTT GGC GAT GGT TCG 2427
Gly Ala His Tyr Tyr Trp Glu Ala Tyr His Val Leu Gly Asp Gly Ser 
350 355 360 

GTT ATG CCT TAT CGT GCA ATG CCT AAG ACC AAT ACT TAT ACG CTT CCT 2475 
Val Met Pro Tyr Arg Ala Met Pro Lys Thr Asn Thr Tyr Thr Leu Pro 
365 370 375 380 

GCT TCT CTG CCT CAG AAT CAG GCT TCT TAT AGC ATT CAG GCT TCT GCC 2523 
Ala Ser Leu Pro Gln Asn Gln Ala Ser Tyr Ser Ile Gln Ala Ser Ala
385 390 395 

GGT TCT TAC GTA GCT ATT TCT AAA GAT GGA GTT TTG TAT GGA ACA GCT 2571 
Gly Ser Tyr Val Ala Ile Ser Lys Asp Gly Val Leu Tyr Gly Thr Gly
400 405 410 

GTT GCT AAT GCC AGC GGT GTT GCG ACT GTG AAT ATG ACT AAG CAG ATT 2619 
Val Ala Asn Ala Ser Gly Val Ala Thr Val Asn Met Thr Lys Gln Ile
415 420 425 

ACG GAA AAT GGT AAT TAT GAT GTA GTT ATC ACT CGC TCT AAT TAT CTT 2667 
Thr Glu Asn Gly Asn Tyr Asp Val Val Ile Thr Arg Ser Asn Tyr Leu
430 435 440 

CCT GTG ATC AAG GAA ATT CAG GCA GGA GAG CCT AGC CCC TAC CAG CCT 2715 
Pro Val Ile Lys Glu Ile Gln Ala Gly Glu Pro Ser Pro Tyr Gln Pro
445 450 455 460 

GTT TCC AAC TTG ACT GCT ACA ACG CAG GGT CAG AAA GTA ACG CTC AAG 2763 
Val Ser Asn Leu Thr Ala Thr Thr Gln Gly Gln Lys Val Thr Leu Lys
465 470 475 

TGG GAT GCC CCG AGC GCA AAG AAG GCA GAA GGT TCC CGT GAA GTA AAA 2811
Trp Asp Ala Pro Ser Ala Lys Lys Ala Glu Gly Ser Arg Glu Val Lys
480 485 490 

CGG ATC GGA GAC GGT CTT TTC GTT ACG ATC GAA CCT GCA AAC GAT GTA 2859 
Arg Ile Gly Asp Gly Leu Phe Val Thr Ile Glu Pro Ala Asn Asp Val
495 500 505 

CGT GCCAACGAAG CCAAGGTTGT GCTCGCAGCA GACAACGTAT GGGGAGACAA 2912 
Arg 



TACGGGTTAC CAGTTCTTGT TGGATGCCGA TCACAATACA TTCGGAAGTG TCATTCCGGC 2972

AACCGGTCCT CTCTTTACOG GAAGAGCTTC TTCCAATCTT TACAGTGCGA ACTTCGAGTA 3032

TTTGATCCCG GCCAATGCCG ATCCTGTTGT TACTACACAG AATATTATCG TTACAGGACA 3092

GGGTGAAGTT GTAATCCCCG GTGGTGTTTA CGACTATTGC ATTACGAAGC CGGAACCTGC 3152

ATCCGGAAAG ATGTGGATCG CAGGAGATGG AGGCAACCAG CCTGCACGTT ATGACGATTT 3212

CACATTCGAA GCAGGCAAGA AGTACACCTT CACGATGCGT CGCGCCGGAA TGGGAGATGG 3272

AACTGATATG GAAGTCGAAG ACGATTCACC TGCAAGCTAT ACCTACACGG TGTATCGTGA 3332

CGGCACGAAG ATCAAGGAAG GTCTGACGGC TACGACATTC GAAGAAGACG GTGTAGCTGC 3392

AGGCAATCAT GAGTATTGCG TGGAAGTTAA GTACACAGCC GGCGTATCTC CGAAGGTATG 3452

TAAAGACGTT ACGGTAGAAG GATCCAATGA ATTTGCTCCT GTACAGAACC TGACCGGTAG 3512

TGCAGTAGGT CAGAAAGTAA CGCTTAAGTG GGATGCACCT AATGGTACC 3561



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 509 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Asp Val Tyr Thr Asp His Gly Asp Leu Tyr Asn Thr Pro Val Arg Met 
1 5 10 15

Leu Val Val Ala Gly Ala Lys Phe Lys Glu Ala Leu Lys Pro Trp Leu 
20 25 30 

Thr Trp Lys Ala Gln Lys Gly Phe Tyr Leu Asp Val His Tyr Thr Asp
35 40 45 

Glu Ala Glu Val Gly Thr Thr Asn Ala Ser Ile Lys Ala Phe Ile His
50 55 60 

Lys Lys Tyr Asn Asp Gly Leu Ala Ala Thr Ala Ala Pro Val Phe Leu 
65 70 75 80 

Ala Leu Val Gly Asp Thr Asp Val Ile Ser Gly Glu Lys Gly Lys Lys
85 90 95 

Thr Lys Lys Val Thr Asp Leu Tyr Tyr Thr Ala Val Asp Gly Asp Tyr 
100 105 110 

Phe Pro Glu Met Tyr Thr Phe Arg Met Ser Ala Ser Ser Pro Glu Glu
115 120 125

Leu Thr Asn Ile Ile Asp Lys Val Leu Met Tyr Glu Lys Ala Thr Met
130 135 140 

Pro Asp Lys Ser Tyr Leu Glu Lys Ala Leu Leu Ile Ala Gly Ala Asp
145 150 155 160

Ser Tyr Trp Asn Pro Lys Ile Gly Gln Gln Thr Ile Lys Tyr Ala Val
165 170 175 

Gln Tyr Tyr Tyr Asn Gln Asp His Gly Tyr Thr Asp Val Tyr Thr Tyr
180 185 190 

Pro Lys Ala Pro Tyr Thr Gly Cys Tyr Ser His Leu Asn Thr Gly Val 
195 200 205 

Gly Phe Ala Asn Tyr Thr Val His Gly Ser Glu Thr Ser Trp Ala Asp 
210 215 220 

Pro Ser Val Thr Ala Thr Gln Val Lys Ala Leu Thr Asn Lys Asn Lys
225 230 235 240 

Tyr Phe Leu Ala Ile Gly Asn Cys Cys Val Thr Ala Gln Phe Asp Tyr
245 250 255 

Pro Gln Pro Cys Phe Gly Glu Val Met Thr Arg Val Lys Glu Lys Gly
260 265 270 

Ala Tyr Ala Tyr Ile Gly Ser Ser Pro Asn Ser Tyr Trp Gly Glu Asp
275 280 285 

Tyr Tyr Trp Ser Val Gly Ala Asn Ala Val Phe Gly Val Gln Pro Thr
290 295 300 

Phe Glu Gly Thr Ser Met Gly Ser Tyr Asp Ala Thr Phe Leu Glu Asp
305 310 315 320 

Ser Tyr Asn Thr Val Asn Ser Ile Met Trp Ala Gly Asn Leu Ala Ala
325 330 335 

Thr His Ala Glu Asn Ile Gly Asn Val Thr His Ile Gly Ala His Tyr
340 345 350 

Tyr Trp Glu Ala Tyr His Val Leu Gly Asp Gly Ser Val Met Pro Tyr 
355 360 365 

Arg Ala Met Pro Lys Thr Asn Thr Tyr Thr Leu Pro Ala Ser Leu Pro 
370 375 380 

Gln Asn Gln Ala Ser Tyr Ser Ile Gln Ala Ser Ala Gly Ser Tyr Val
385 390 395 400 

Ala Ile Ser Lys Asp Gly Val Leu Tyr Gly Thr Gly Val Ala Asn Ala
405 410 415 

Ser Gly Val Ala Thr Val Asn Met Thr Lys Gln Ile Thr Glu Asn Gly
420 425 430 

Asn Tyr Asp Val Val Ile Thr Arg Ser Asn Tyr Leu Pro Val Ile Lys
435 440 445 

Glu Ile Gln Ala Gly Glu Pro Ser Pro Tyr Gln Pro Val Ser Asn Leu
450 455 460 

Thr Ala Thr Thr Gln Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro
465 470 475 480 

Ser Ala Lys Lys Ala Glu Gly Ser Arg Glu Val Lys Arg Ile Gly Asp
485 490 495 

Gly Leu Phe Val Thr Ile Glu Pro Ala Asn Asp Val Arg
500 505 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Lys Ile Lys Glu Gly Leu Thr
1 5 10 15

Ala Thr Thr Glu Asp Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys 
20 25 30 

Val Glu Lys Tyr Thr Ala Gly Ser Val Ser Pro Lys Val Cys 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: N-terminal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Tyr Thr Pro Val Glu Glu Lys Gln Asn Gly Arg Met Ile Val Ile Val
1 5 10 15

Ala Lys Lys Tyr 
20 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Gln Leu Pro Phe Ile Phe Asp Val Ala Cys Val Asn Gly Asp Phe Leu
1 5 10 15

Phe Ser Met Pro Cys Phe Ala Glu Ala Leu Met Arg Ala Gln
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Gly Glu Pro Asn Pro Tyr Gln Pro Val Ser Asn Leu Thr Ala Thr Thr
1 5 10 15

Gln Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Ser Thr Lys
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Asp Gln Ala Asn Phe Leu Gln Cys Val Gly Ser Leu Met Cys Arg Leu
1 5 10 15

Asp Phe Phe Phe Glu Ala Val Met Pro Ile Phe Pro Ala Ala
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: not relevant 
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser
1 5 10 15

Pro Lys Val Cys Lys Asp Val Thr Val 
20 25 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Ala His Glu Lys Thr Tyr Pro Val Glu Asp Val Asn Cys Ser Tyr Val 
1 5 10 15

Lys Thr Val Cys Val Gly Gly Lys Val 
20 25 

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Arg Met Phe Met Asn Tyr Glu Pro Gly Arg Tyr Thr Pro Val Glu Glu 
1 5 10 15

Lys Gln Asn Gly
20 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Thr Phe Ala Gly Phe Glu Asp Thr Tyr Lys Arg Met Phe Met Asn Tyr 
1 5 10 15

Glu Pro Gly Arg 
20 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly
1 5 10 15

Leu Thr Ala Thr Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn Met 
20 25 30 

Glu Tyr Cys Val Cys Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val 
35 40 45 

Cys 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu
1 5 10 15

Thr Ala Thr Thr Phe Glu Glu Asp Gly 
20 25 
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Ala Thr Thr Phe Glu
1 5 10 15

Glu Asp Gly Val Ala Thr Gly Asn
20 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Lys Ile Lys Glu Gly Leu Thr Ala Thr Thr Phe Glu Glu Asp Gly Val
1 5 10 15

Ala Thr Gly Asn His Glu Tyr 
20 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Phe Glu Glu Asp 
1 
(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Lys Trp Asp Ala Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn 
1 5 10 15

Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu Ser Glu 
20 25 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Tyr Thr Pro Val Glu Glu Lys Glu Asn Gly Arg Met Ile Val Ile Val
1 5 10 15

Ala Lys Lys Tyr 
20 
WE CLAIM:

1. A vaccine composition comprising at least one
oligopeptide, said oligopeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID
NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14,
SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:23 and SEQ ID
NO:24, and a physiologically acceptable diluent.

2. The vaccine composition of claim 1 comprising at least 
one oligopeptide, comprising an amino acid sequence as 
given in SEQ ID NO:10, and a physiologically acceptable
diluent. 

3. The use of at least one oligopeptide of less than about
35 amino acids, said oligopeptide comprising an amino
acid sequence selected from the group consisting of SEQ
ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID
NO:14, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID
NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:23 and SEQ
ID NO:24, in formulating a vaccine composition for
protecting an animal or a human from infection by and/or
periodontal disease caused by Porphyromonas gingivalis.

4. The use as in claim 3, wherein at least one oligopeptide 
comprises a sequence as given in SEQ ID NO: 10. 

5. A method for protecting an animal, including a human, 
from gingivitis and/or periodontal disease, said method 
comprising the step of administering to said animal or 
human the vaccine composition of claim 1. 

6. The method of claim 5 wherein said immunogenic 
composition comprises at least an oligopeptide, 
comprising an amino acid sequence as given in SEQ ID 
NO: 10. 

7. The method of claim 5 wherein said immunogenic 
composition is administered via a route selected from the 
group consisting of subcutaneous injection, 
intraperitoneal administration, oral administration, and 
administration to a mucosal surface of the animal or 
human for which protection is sought. 

8. An oligopeptide of less than about 35 amino acids, said 
oligopeptide comprising an amino acid sequence selected 
from the group consisting of amino acid sequences as 
given in SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID
NO:14, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID
NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:23 and SEQ
ID NO:24.